Wednesday, August 22, 2007

Windows Date Created Timestamp strangeness

I have been using Windows since version 3.0 and thought  I had seen most of its subtleties. However today I found a new "gem" I had not encountered before.

We use two Perl scripts to do some FTP transfers regularly, scheduled by the Windows "Scheduled Tasks" to run several times a day. The scripts both use a common function to append to their respective daily log file. In case a file is older than 5 days, judging by its age in days based on the "Date Created" timestamp, it will be deleted and a new one created under the same name. This is intended to not have the files grow indefinitely.

While one of the jobs worked just fine, appending to its log files and rotating after 5 days, the other seemed to overwrite its log file on each run. Strangely enough  - as said before - they both used the same log function, just with different file names.

After some poking around in the scripts source we decided to make a more low level test to see whether this had anything to do with either Perl or the "Scheduled Tasks". We created a file from the command line by just redirecting an echo command. We then deleted the file and recreated it again by the same means. This is what we got (German Windows XP, but behavior observed was equivalent to Windows Server 2003 and Windows Professional 2000):

C:\quantum>time /t

C:\quantum>echo "quantum" >> mechanics.txt

C:\quantum>dir /T:C mechanics.txt
 Volume in Laufwerk C: hat keine Bezeichnung.
 Volumeseriennummer: 0CA5-E6F2

 Verzeichnis von C:\quantum

22.08.2007  22:46                 8 mechanics.txt
               1 Datei(en)              8 Bytes
               0 Verzeichnis(se),  8.096.706.560 Bytes frei

C:\quantum>time /t

C:\quantum>del mechanics.txt

C:\quantum>echo "temporal" >> mechanics.txt

C:\quantum>dir /T:C mechanics.txt
 Volume in Laufwerk C: hat keine Bezeichnung.
 Volumeseriennummer: 0CA5-E6F2

 Verzeichnis von C:\quantum

22.08.2007  22:46                 8 bar.txt
               1 Datei(en)              8 Bytes
               0 Verzeichnis(se),  8.096.706.560 Bytes frei


Notice the experiment starts at 22:46. A file is created and its creation timestamp is automatically set to 22:46 (dir /T:C shows the creation time). Then, at 22:47 the file is deleted and recreated with different content. Again the file creation date is 22:46 while the "Date Modified" timestamp is 22:47.

Looking at the timestamps of the log file mentioned above we noticed that the "Date Created" attribute was back several months. Apparently every time the file had been deleted and recreated the original timestamp had been restored as well.

Looking around the net for some explanation for this (we suspected a filesystem bug) after a fair amount of searching I finally came across "The Old New Thing: The apocryphal history of file system tunnelling" and through that Microsoft Knowledge Base article 172190, titled "Windows NT Contains File System Tunneling Capabilities".

Basically what happens is that Windows caches the timestamp of the deleted file for some time. In case a new file appears under the same name, it will get the cached value. The default maximum time period between deleting and re-creating the file is 15 seconds. However we could also see it happen with more than 8 minutes! I do not yet understand how this comes about...

Once I knew the keyword was "tunnelling" it was also easy to find this 2001 post from a discussion on Bugtraq. I completely agree with Ken Brown who wrote that post (especially concerning the 2nd paragraph):

"Tunnelling" is a long way from any keywords that I'd associate with file systems - and a search for "tunnelling and ntfs" turns up a great many references to VPNs and bits of networking. It now turns out that it isn't really a property of the file system at all, which obviously makes the search even harder. 
Obviously not serious, but I bet that someone, somewhere, has an application that depends on file creation dates and wonders why it goes wrong every now and again. That is a *mild* potential security problem, if only because it could cause confusion. Documentation bugs can be security problems. Unexpected or unwanted behaviour from a machine is always a potential security problem.
The accumulation of seemed-like-a-good-idea-at-the-time backwards-compatible gotchas in the Windows file systems [...] all combine to introduce uncertainty and unpredictability, which leaves gaps for security errors.

Because one of our scripts took more than 15 seconds (the default time-to-live for cache entries) between deleting and recreating the file as it was scheduled during a high load time, it literally took advantage from the lack of resources.

The other script, running during far less busy periods had always been fast enough to trigger the tunnelling "feature" and get the cached creation time over and over again.  Of course the workaround for the scripts is simple: Consider the "Date Modified" attribute.

A more drastic approach would be to change the registry settings controlling this behavior. See the knowledge base article for details. 

But seriously - I will never understand (and even less appreciate) - why Microsoft tends to always choose backwards compatibility over reliability. Maybe this was a sensible feature right when long filenames on FAT volumes came about, but seriously, is it necessary to carry this all the way to the 2003 server? I bet it is still present in Vista as well....


csours said...

Thank you very much for the links. I had a heck of a time finding this information; If you hadn't written this up, I may have given up finding an answer. This tunneling is just another instance of 'Windows Magic', that is, Windows Will Do IT For You (what it is and if you want it to happen notwithstanding). Anyway, thanks.

faiz said...

I am not sure if things have changed since your original post, but the file in your example show "bar.txt" even though you had apparently asked for "mechanics.txt" ???

I reached your post trying to decipher another strange behavior coming from WinXP. It apparently reports some kind of cached timestamps for even the modified date (looking at properties of the file form the right-click menu) and a second request without any changes gives back an updated value.

You post indicated that requesting for the "modification date" might solve the issue you noticed, but may be not.

However I am working on a MacFusion session of WinXP with the file in question actually residing on the Mac OS. Maybe that has something to do with it?

Joe said...

Well what do you know ... I was playing with some C# code that tests simultaneous file writing/reading and ran into the "tunneling" effect while looking at file creation times. It's August 2010 and I am convinced that twenty years from now someone will find your blog entry when they stumble across the same "feature" on Windows Server 2030.

Ondro said...

True story, same problem experienced on Windows 8.1 Pro N

Anonymous said...

Same with Windows 10 64 Pro