Editing Help:WebArchive Notes

=Web Archive=
 
The Internet Archive, through its Wayback Machine [https://archive.org/web/ https://archive.org/web/], maintains an archive of websites that started in 1996.  As many know and have experienced, information available through the web is not static.  It is not uncommon for business websites to be reconfigured periodically, with old information removed to improve their utility for current customers.  Companies may also close or reorganize, so that their websites change or disappear.  Individually run websites that serve as long-standing information caches can likewise change or disappear.  However, the Internet Archive can help retrieve information that was available on a certain website at an earlier point in time.
  
This is a useful tool for research, but some cautions are worth noting.  The Internet Archive was (and continues to be) built by taking periodic yet selective snapshots of websites via bots that crawl the web.  (Archival practices are influenced by the archive's available storage and bandwidth, and also by the technical aspects of the capture process, which has itself developed over the years.)  This means that its coverage can be affected by the following issues.
 
# Its snapshots are discrete, not continuous.  Thus information that changed rapidly (and so changed between snapshots) would not be captured.
# Its snapshots are selective.  The web crawlers might not navigate through every subpage and link on each pass, meaning that some subpages might not be archived or might be archived less frequently.
# Some webpages had set-ups that obstructed their being archived, whether intentionally or unintentionally.
# Information behind an authorization wall (i.e., requiring login) could generally not be accessed by web crawlers and so would not be archivable.
# Search features by and large do not operate within archived pages.  [The Internet Archive captures the front end, i.e., the various pages of the website, but not any back-end servers or databases, which would be needed for certain operations like search.]
# Some websites (such as web stores) populate certain pages by a type of query or search against their database.  This would yield several pages of results that a user could page through in normal usage.  The Internet Archive may capture the first page for queries coming from a built-in link, but captures of the second or later pages, while possible, tend to be less frequent.
# Images were captured at times, but not always.  Downloadable files were likewise sometimes captured, but not always.
  
An interesting consequence of points 6 and 7 is that information presented as plain text is more likely to have been archived than information that was bundled into a PDF download or arranged using a more sophisticated or interactive user interface.
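
Because snapshots are discrete (point 1 above), it can help to check which capture lies closest to the date of interest before relying on it.  The following is a minimal Python sketch, not part of the original notes.  It assumes the Wayback Machine availability endpoint that the Internet Archive documents separately, and the function names (<code>availability_query</code>, <code>closest_capture</code>, <code>fetch_closest</code>) are hypothetical helpers chosen for illustration.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the publicly documented Wayback availability endpoint.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, when):
    """Build a query asking for the capture closest to the date `when`."""
    params = {"url": page_url, "timestamp": when.strftime("%Y%m%d")}
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_capture(response):
    """Return (capture_time, archived_url) from a decoded JSON reply.

    Returns None when the reply lists no available capture.
    """
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    capture_time = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return capture_time, closest["url"]


def fetch_closest(page_url, when):
    """Query the endpoint over the network and parse the reply."""
    with urlopen(availability_query(page_url, when)) as reply:
        return closest_capture(json.load(reply))
```

Comparing the returned capture time with the requested date shows how far the nearest snapshot actually is from the moment being researched.  A result of <code>None</code> suggests the page was not captured, for instance for one of the reasons listed above.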
  
 
Following links in old pages shown in the Wayback Machine generally leads to archived versions of the linked pages.  However, the capture date for the linked page may differ slightly or even greatly from the capture date of the starting page.  Some links will also redirect to another archived page, following the redirect that was captured.  Other links will yield a 'non-archived URL' error page.
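
The difference in capture dates described above can be checked from the URLs themselves, since Wayback Machine addresses carry the capture time as a 14-digit stamp (YYYYMMDDhhmmss) after <code>/web/</code>.  The sketch below is an illustration only.  The helper names <code>capture_time</code> and <code>capture_gap_days</code> are hypothetical, and the pattern assumes the usual <code>/web/&lt;timestamp&gt;/</code> address layout, optionally followed by a short modifier such as <code>id_</code>.

```python
import re
from datetime import datetime

# Assumed layout: .../web/<14 digits><optional modifier>/<original URL>
_CAPTURE_RE = re.compile(r"/web/(\d{14})[a-z_]*/")


def capture_time(wayback_url):
    """Extract the capture time embedded in a Wayback Machine URL."""
    match = _CAPTURE_RE.search(wayback_url)
    if match is None:
        raise ValueError("no 14-digit capture timestamp found in URL")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


def capture_gap_days(start_url, linked_url):
    """Whole days between the captures of a starting page and a linked page."""
    delta = capture_time(linked_url) - capture_time(start_url)
    return abs(delta).days
```

A large gap is a hint that the linked page may not reflect what a visitor to the starting page would have seen at the time.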
