Grouping and storing personal information in Internet as in a global 'Cloud'

Status of This Memo
 This document proposes a method for using the public Internet as a global,
 effectively unlimited store for personal information, and a way to group one's
 own published information and separate it from all other information.
Copyright Notice
 Copyright (C) The Internet Society (2005).

         This document and translations of it may be copied and
         furnished to others, and derivative works that comment on or
         otherwise explain it or assist in its implementation may be
         prepared, copied, published and distributed, in whole or in
         part, without restriction of any kind, provided that the above
         copyright notice and this paragraph are included on all such
         copies and derivative works.  However, this document itself may
         not be modified in any way, such as by removing the copyright
         notice or references to the Internet Society or other Internet
         organizations, except as needed for the  purpose of developing
         Internet standards in which case the procedures for copyrights
         defined in the Internet Standards process must be followed, or
         as required to translate it into languages other than English.

         The limited permissions granted above are perpetual and will
         not be revoked by the Internet Society or its successors or

         This document and the information contained herein is provided



 Grouping and storing personal information in Internet as in a global 'Cloud'                

V.Gavrilov                               June 2009

 Today's Internet (the so-called Web 2.0) is increasingly used for decentralized
 storing and editing of articles, comments, blogs, bulletins and wikis hosted on
 different public sites. But there is no standard way to extract all the
 information published by one user, to group and search such information, or to
 easily separate one's own information from the myriad of articles published by
 other users.
 This grouping exists only in the author's head, as personal associations and
 memory. As time passes, some publications are lost, URLs and domains change and
 bookmarks expire, so the only way to find one's published information is to
 search the Internet again and pick it out by hand from the millions of results
 returned by a search engine, narrowing the queries step by step.
 Proposed here is a simple method of grouping personal information: attach a
 signature to every published message, where the signature is the short output
 of a one-way hash function applied to a combination of the user's name, date of
 birth and a personal phrase (the phrase is there to avoid collisions).
 To avoid confusion with the term "digital signature" from asymmetric
 cryptography, this signature/hash is called NID (Network ID) from here on.
 A scenario could look like this: while browsers do not yet support NID, a user
 can copy and paste the NID by hand into every personal post.
 Later, the user will be able to query a favourite search engine for the NID and
 get back only his or her own material, separated from the rest of the
 Internet, because the NID is a word generated from unique personal
 information and is highly unlikely to occur anywhere else.
 In future, browsers or search engines could offer a "Personal/Public"
 checkbox; selecting it would extract from the Internet only the information
 the user has ever published with this NID.
 This is the simplest scenario. More complex uses, such as forming multiple
 groups inside one personal group by means of keywords attached to the NID,
 are easy to show.
 Let's consider a simple implementation of such an NID generator.
 Compactness of the NID is clearly desirable, and it seems that even the 128-bit
 MD5 algorithm can be used successfully for this purpose. (The published
 collision vulnerabilities of MD5 should not affect this particular usage:
 first, the key is generated from a short string of predefined form, and second,
 even if a collision does occur, its impact is small, so it is not critical.)
 On the other hand, a 22-character hash string (the base64-encoded 128-bit
 binary output of MD5) is short enough to be stored in every post or article
 across the Internet.
 Generating a personalized NID may be hypothetically demonstrated by the
 following UNIX command * (the real output will be longer than 22 characters,
 since the standard built-in md5 prints HEX, which is less compact than base64):
 $ echo "Vasili Gavrilov 01011968 my cat's name - Kuzma" | md5
 XQgYnkgYSBwZXJzZXZlcm

 where "my cat's name - Kuzma" is a seed/salt, added to avoid collisions between
 people who have the same name and were born on the same date, and also to
 prevent another person from generating the NID in order to extract somebody
 else's information (if the name, DOB and NID generation protocol are known).
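
 The generation step could be sketched in Python roughly as follows. This is
 only an illustration under assumptions, not a reference implementation: the
 function name make_nid is hypothetical, the fields are joined with a single
 space as in the command above, and the URL-safe base64 alphabet is chosen here
 so that the NID contains no '+' or '/' characters. Unlike echo, the sketch
 does not append a trailing newline to the seed, so it hashes a slightly
 different input than the shell command does.

```python
import base64
import hashlib


def make_nid(name: str, date_of_birth: str, salt: str) -> str:
    """Return a 22-character NID: base64 of the 128-bit MD5 digest, unpadded."""
    # Assumption: fields are joined with one space, as in the example command.
    seed = f"{name} {date_of_birth} {salt}"
    digest = hashlib.md5(seed.encode("utf-8")).digest()  # 16 bytes = 128 bits
    # 16 bytes encode to 24 base64 characters, the last two being '=' padding;
    # dropping the padding leaves the 22 characters mentioned in the text.
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")
    return encoded.rstrip("=")
```

 Calling make_nid("Vasili Gavrilov", "01011968", "my cat's name - Kuzma")
 always returns the same 22-character string for the same input, and a
 different salt gives a different NID.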

 Note that there should be at least a minimal protocol defining which fields
 are used and in which order to generate the NID, to avoid collisions caused by
 feeding too simple an input into the md5+base64 combination.
 This RFC is intended to start the discussion of this convention.
 For example, the protocol could require first name, last name, date of birth
 and "salt", in this order (either in any letter case and with any delimiter
 or, vice versa, with a predefined delimiter and casing - TBD. Benefits of that?).
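
 One possible answer to the question of a predefined delimiter and casing can
 be sketched as follows. All the rules here are assumptions chosen only for
 illustration (lowercasing, collapsing whitespace, '|' as the delimiter, and
 the hypothetical names canonical_seed and nid_from_fields); the memo itself
 leaves them TBD. The benefit shown is that the same person typing the fields
 slightly differently still gets the same NID.

```python
import base64
import hashlib


def canonical_seed(first_name, last_name, date_of_birth, salt):
    """Join the four fields in a fixed order after normalizing each one."""
    fields = [first_name, last_name, date_of_birth, salt]
    # Hypothetical rules: trim, collapse inner whitespace, lowercase.
    cleaned = [" ".join(field.split()).lower() for field in fields]
    # Hypothetical delimiter: '|' between fields.
    return "|".join(cleaned)


def nid_from_fields(first_name, last_name, date_of_birth, salt):
    """Hash the canonical seed with MD5 and return unpadded URL-safe base64."""
    seed = canonical_seed(first_name, last_name, date_of_birth, salt)
    digest = hashlib.md5(seed.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
```

 With rules like these, "Vasili" / " GAVRILOV " and "vasili" / "gavrilov"
 produce one and the same NID, while a free-form input would produce two
 unrelated ones.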
 An extension of the protocol could be attaching a personal keyword or an
 association to the NID.
 For example, when storing something connected with a photo, the user could
 append "photo" to the end of the NID; later, searching the Internet for this
 string would return all the entries ever stored with this key.
 Storing multiple NID keys with different keywords makes it possible to create
 arbitrary groups, and search engines will do their regular job of intersecting
 the groups.
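
 The keyword extension and the intersection of groups can be illustrated with
 a small sketch. The function names (tag, sign_post, find_posts) are
 hypothetical, and a plain list of strings stands in for the search engine's
 index; a real engine would do the matching itself.

```python
def tag(nid, keyword):
    """Build a group key by appending the keyword to the end of the NID."""
    return nid + keyword


def sign_post(text, nid, keywords):
    """Append one NID+keyword key per keyword on a new line after the post."""
    keys = [tag(nid, keyword) for keyword in keywords]
    return text + "\n" + " ".join(keys)


def find_posts(posts, nid, keywords):
    """Return the posts that contain every requested NID+keyword key."""
    wanted = {tag(nid, keyword) for keyword in keywords}
    # A post matches only if all wanted keys occur among its words,
    # which is the intersection of the per-keyword groups.
    return [post for post in posts if wanted <= set(post.split())]
```

 A post signed with both "photo" and "vacation" is found by a search for
 either keyword or for both, while a post by somebody else that merely
 contains the words "vacation photo" is not, because it lacks the NID prefix.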
 Note also that it will be hard for another person to get somebody else's
 information, thanks to the irreversibility of the hash function and the
 'salt' acting as a 'password'.
 Nobody is prevented from using more than one NID, which is very different from
 assigning one NID to every user forever, so this also seems to be a very
 privacy-friendly approach.
 In future, browsers (or search engines) may support transparent appending of
 the NID to requests, so that searching one's past personal postings connected
 with "photo" and "vacation" is done just by typing into the browser:
 "photo vacation" - as is done today against common data.
 Since the above-mentioned "Personal" checkbox is checked, the browser attaches
 the locally saved NID (or one saved in a cookie or a session) and sends a more
 restricted request that returns only the user's personal postings (from
 multiple sites) containing both keywords "photo" and "vacation".
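
 What the browser might do with the "Personal" checkbox can be sketched as
 below. This is an assumption about one possible behaviour, not a description
 of any existing browser: the function name build_search_request is
 hypothetical, and "attaching" the NID is interpreted here as prefixing every
 typed keyword with it, matching the NID+keyword keys described earlier.

```python
def build_search_request(user_input, personal, stored_nid=None):
    """Rewrite the typed query when the 'Personal' checkbox is checked."""
    if not personal:
        # Public search: the query is sent unchanged.
        return user_input
    if not stored_nid:
        raise ValueError("no NID stored for personal search")
    # Personal search: every keyword becomes an NID+keyword key, so only
    # postings signed with this NID and all the keywords can match.
    return " ".join(stored_nid + keyword for keyword in user_input.split())
```

 With the checkbox off, "photo vacation" is sent as typed; with it on, the
 same input is turned into two keys that both start with the stored NID.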
 Other extensions can be imagined, such as attaching a counter or another id at
 the end, to allow saving redundant (identical) data to multiple sites and to
 make duplicated data easier to distinguish in the browser.
 This can be elaborated further.
 The procedure described above makes it possible to use the public Internet as
 an unlimited store of personal data, with easy extraction and grouping of such
 data and separation of public data from data saved by a person. It also turns
 saving data to the Internet into saving to one global 'Cloud' and abstracts
 the location (a URL may change, but the information remains searchable).
 This will allow personal data to be distributed evenly across multiple public
 storages and, in future, possibly to organize personal distributed services
 that work with personal data in a truly parallel way.


*)  A reference tool for generating such a signature is here:
