Tutorial: Binary Files From Usenet

What are all these different files? What do I do with them? Why is this so complicated?

Split files
.001, .002, .003, etc. files
Archives
.zip files and other similar archives
.rar files (and .rxx)
Parity correction files (often referred to as Parchive files)
.par files (and .pxx)
.par2 files
CD Images
.bin/.cue and other CD image files
Video Files
.dat files and .vob files
Extra files you'll often see
.nfo and .diz files
.nzb files

When you want to download binary files, you might think you are just going to get .mp3 files for sounds, .jpg and similar files for pictures, .mpg, .avi, .wmv, etc. for video, and so on - but in order to manage large posts, almost everyone who posts files will archive them into sets of files, all nearly the same size, that together contain the file or files that you want. So downloading binary files becomes a multi-step process. Your newsreader software will hopefully do some of the heavy lifting for you, downloading individual encoded posts and combining them into the target binary files - but there are still some steps that usually have to be taken before you are done.

First step - know what you are looking at! In Windows 95 and later Windows versions, it became fashionable to hide files and extensions from the user of the computer, so once you download files, it's difficult to tell sometimes exactly what kind of file you are looking at. You really want to be able to see and work with the file extensions - so if you are using Windows, perform the following steps if you have not already. First, choose My Computer >> Right-Click and choose "Explore". In the Windows Explorer, choose Tools >> Folder Options >> click on the "View" tab. Deselect "Hide File Extensions for known types". Click on the "Like Current Folder" button to make this setting global. Now you will be able to see and edit the extensions of the files you download.


Tools>>Folder Options>>View Dialog

In many newsgroups, the files are simply posted in their original format (.mp3, etc.), and all you have to do is download and enjoy them. Just double-click on the file and it will open in Windows Media Player or whatever application you use for various media. As you start looking at more newsgroups with different kinds of files posted to them, you'll start seeing files that are in strange formats and you may wonder what to do with them. Some files are archives or file containers that actually include what you are looking for. Others are files that you can load into various utilities to make sure that the binaries you want are complete and have not been corrupted or damaged in transit - and in some cases, even repair the damaged or missing files. Others are files that describe the binaries being uploaded, or even describe the upload in such a way that your newsreader can load one file and have a "recipe" for downloading the entire set of posted files. Here is a discussion of many of the most common files you'll encounter in the binaries newsgroups on Usenet, and some tips and tricks for dealing with them, and where possible, some links to utilities and helper programs that will assist you.

Split files
.001, .002, .003, etc. files
These are simply large files that have been split into pieces for easier handling during upload and download. Often they are very large MPEG or other files that are already in a highly compressed form, so that further compression isn't needed. To recombine them, the binary files just need to be "added" back together in the order that the split parts were taken from the original file. This can sometimes be done with a simple DOS command or batch file:
copy /b file.mpg.001+file.mpg.002+file.mpg.003 file.mpg

Note that this method uses "copy /b" for binary copy, not simply "copy". Now, imagine that you have a file split into 115 parts - that would be a lot of typing. Much better to use a utility made for the purpose! Two of the most popular utilities are HJSplit (http://www.freebyte.com/hjsplit) and Mastersplitter (http://www.tomasoft.com). These utilities will allow you to rejoin split files like this, and to split files if you need to. They also include many features that will help make sure that you are not missing any parts and that the parts go together in the right order.
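If you would rather script the join than type a long copy /b line, here is a minimal Python sketch of the same idea. It assumes the parts sit in one directory, are numbered from .001 upward with three digits, and that nothing else in the directory shares that naming; the function name join_split_files is made up for this example. It refuses to join if a part number is missing, which is roughly the kind of check the utilities above perform.

```python
import re
import shutil
from pathlib import Path


def join_split_files(first_part, output):
    """Concatenate name.001, name.002, ... into output, in numeric order."""
    first_part, output = Path(first_part), Path(output)
    base = first_part.stem  # "file.mpg.001" -> "file.mpg"
    parts = sorted(
        (p for p in first_part.parent.iterdir()
         if p.stem == base and re.fullmatch(r"\.\d{3}", p.suffix)),
        key=lambda p: int(p.suffix[1:]),
    )
    # Refuse to join if the sequence has a gap (a missing part).
    numbers = [int(p.suffix[1:]) for p in parts]
    if not parts or numbers != list(range(1, len(parts) + 1)):
        raise ValueError(f"missing or unexpected part numbers: {numbers}")
    with output.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)  # binary copy, like "copy /b"
    return [p.name for p in parts]
```

Calling join_split_files("file.mpg.001", "file.mpg") would rebuild file.mpg from however many parts are present, whether that is 3 or 115.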



HJSplit Utility showing a set of files selected to be joined. Just select the ".001" file and the rest will be located automatically if they are in the same directory.

Archives
.zip files and other similar archives
This is a common kind of archive format that most people are familiar with. In Usenet newsgroups, you will often see sets of small files bundled together into .zip archives and then uploaded as a single file. Programs and utilities that are downloaded from websites are often archived together as a .zip file as well. The .zip files not only collect a number of files together for easier downloading, they also compress the file data so that it takes less time to download. Other similar but less used formats are .arc files (an older DOS type with versions by Thom Henderson's SEA and Phil Katz' PKWare), .gz and .tar files (common UNIX types), .arj files (an alternative to ZIP with file splitting, encrypting, and advanced validity checking), .lzh files (a simple public domain compression), .sit (Macintosh Stuffit), and .cab files (usually used for installation packages). As a side note, a fascinating piece of hacker/net history from the BBS days is the story of the ARC format, the fight between Henderson and Katz, the subsequent birth of ZIP, and Katz' tragic death in 2000. I met Katz at COMDEX in '86 or '87 and found him to be an insanely intelligent guy. Just Google "Phil Katz" for many interesting articles from both sides of the argument. There are a number of utilities that can open and extract files from these various kinds of archives.

WinZip is a utility that runs under most versions of Windows, and can extract from all of the formats listed above except .sit (Stuffit) files. It is available as a try-before-you-buy download at http://www.winzip.com. The originator of the ZIP format is PKWare (founded by Katz, mentioned above), and their utilities can be found at http://www.pkware.com.

For Linux and Mac users, Zip utilities can be found at http://www.info-zip.org. At ArjSoft's website, http://www.arjsoft.com, you'll find utilities for .arj files for use with DOS/Windows environments, and Aladdin Systems (http://www.aladdinsys.com) makes utilities for Macintosh, Unix, Solaris, and Windows that handle not only .sit archive files, but .arj, .zip, and many other formats as well.
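For plain .zip files you can also check a download without any of these utilities, using the zipfile module that ships with Python. The sketch below only covers .zip (none of the other formats listed above), and the function name check_zip is just for illustration. It lists the archive contents and asks the module to verify each member's stored CRC.

```python
import zipfile


def check_zip(path_or_file):
    """Return (member_names, first_bad_member).

    first_bad_member is None when every member passes its CRC check.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        return archive.namelist(), archive.testzip()
```

If the second value comes back as None, the archive read cleanly and extracting it with zipfile.ZipFile(path).extractall(folder) should be safe to try.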

.rar files (and .rxx)
I'm discussing .rar files separately because it is by far the most widely used format on Usenet for distributing binaries. There are a number of fine points in managing RAR files that I'd like to talk about as well. It's not that the RAR format has more complications or difficulties than any other, but it is so widely used, and you'll see so many versions and techniques used to create RAR files, that you are more likely to run into the associated problems, so it is a good file type to know well. The best software to use for creating, combining, and uncompressing RAR files is WinRAR. It is available for Windows, Macintosh, UNIX, Linux, and PocketPC from Rarlab (http://www.rarlab.com) - you can download try-before-you-buy versions at their website.

There are two kinds of RAR file sets that you'll commonly run into. RAR file sets created with an older version of WinRAR (before version 3.0) have filenames that will look somewhat like this: filename.rar, filename.r00, filename.r01, filename.r02, etc., with the first file in the set named filename.rar. Newer versions of WinRAR (version 3.0 and above) create file sets with names that look like this: filename.part01.rar, filename.part02.rar, filename.part03.rar, etc., counting up to the total number of files in the set. Newer versions of WinRAR can create the "old" style filenames, but usually this isn't done.
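To make the two naming schemes concrete, here is a small Python sketch that works out where a given filename falls in its set. It assumes that in an old-style set the .rar file is the volume you open first, followed by .r00, .r01, and so on, and it does not try to handle very large old-style sets that run past .r99. The name volume_order is made up for this example.

```python
import re

NEW_STYLE = re.compile(r"^(?P<base>.+)\.part(?P<num>\d+)\.rar$", re.IGNORECASE)
OLD_STYLE = re.compile(r"^(?P<base>.+)\.(?:rar|r(?P<num>\d{2}))$", re.IGNORECASE)


def volume_order(name):
    """Return (base, index), where index 0 is the volume to open first.

    Returns None if the name does not look like a RAR volume.
    """
    m = NEW_STYLE.match(name)
    if m:
        return m["base"], int(m["num"]) - 1  # part01 -> 0, part02 -> 1, ...
    m = OLD_STYLE.match(name)
    if m:
        # Assumed order: name.rar first, then name.r00, name.r01, ...
        return m["base"], 0 if m["num"] is None else int(m["num"]) + 1
    return None
```

Sorting a directory listing with sorted(names, key=volume_order) puts a set into opening order, and any name with index 0 is the one to double-click. Filter out names for which volume_order returns None before sorting.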

The most important thing to remember when using WinRAR is to make sure that you always have the latest version if you are downloading RAR files. Most of the problems that people have with files being corrupt or unusable come from attempting to open a set of RAR files in a version of WinRAR that is older than the version used to create the RAR files. This is also one of the first things to check (after running a PAR or SFV validation as outlined below) if you attempt to open a set of RAR files and get errors saying that the compression isn't valid or other errors of that type.

WinRAR will also give you warnings and errors if any of the parts are corrupt or missing from having been uploaded or downloaded improperly. Sometimes the file errors are negligible, and you can actually still use the files - by checking "keep broken files" in the UnRAR options, you can save the file even though it's damaged, and see if it's still usable.

If the RAR files are loaded out of order - say you double-clicked on the ".r04" file instead of the ".rar" file, or the "part04.rar" file rather than the "part01.rar" file - it's possible to accidentally unRAR only part of the files in the archive. Sometimes in this case you will be prompted with options including "extract all files and folders from the current" - always click on that option in that case, but it's safest to simply make a habit of opening the RAR file set by clicking on the ".rar" or "part01.rar" file to make sure the set is completely loaded.

Once the file set is open, the dialog will show all the files in the archive. Select them all, right-click them, and choose "Extract to the specified folder", and choose the folder that you want to save the file(s) to. If you type in a folder name that does not exist, it will be created.


WinRAR with a set of RAR files loaded, ready to extract. This is the same set of files used in the PAR and PAR2 examples below.

But what if the archive files are corrupt or incomplete? Often (almost always with experienced and considerate posters) there will be some recovery files posted along with the files themselves. The first thing to do is to download the files or archives themselves. If there is a file with the extension ".sfv" in the post, you can download it and use it to test whether the download is complete and undamaged. To use these files, get one of several utilities that read them, including cSFV (freeware) at xxx, or QuickSFV from xxx. The SFV files are quite small, so they are easy and fast to download. Just open the SFV file, and the program will test the downloaded files and notify you of any that are missing or damaged.
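For the curious, the check an SFV utility performs is simple enough to sketch in a few lines of Python. This assumes the usual SFV layout as I understand it: one "filename CRC32-in-hex" pair per line, with comment lines starting with a semicolon. The function names are made up for this example, and a real utility will be more forgiving of odd files.

```python
import zlib
from pathlib import Path


def crc32_of(path, chunk_size=1 << 20):
    """CRC-32 of a file, read in chunks so large files are fine."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def verify_sfv(sfv_path):
    """Return {filename: "ok" | "bad" | "missing"} for each entry in the SFV."""
    sfv_path = Path(sfv_path)
    results = {}
    for line in sfv_path.read_text(encoding="latin-1").splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # blank line or comment
        fields = line.rsplit(None, 1)  # filename may contain spaces
        if len(fields) != 2:
            continue  # malformed line, skip it
        name, expected = fields
        target = sfv_path.parent / name
        if not target.exists():
            results[name] = "missing"
        elif crc32_of(target) == int(expected, 16):
            results[name] = "ok"
        else:
            results[name] = "bad"
    return results
```

Anything reported as "bad" or "missing" is a candidate for the PAR or PAR2 repair described next.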

If any of the files are missing or damaged, they can be repaired using any PAR or PAR2 files that were included with the post. The way that these files work uses the same technology as recoverable disk drive volumes using "Parity" records (hence the "par" designation).
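The parity idea is easier to see with a toy example. The Python sketch below uses plain XOR parity, which can rebuild exactly one lost block from the surviving blocks plus one parity block. This is a simplification for illustration only: real PAR and PAR2 files use Reed-Solomon coding, which extends the same idea so that several parity files can repair several missing files.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Three equal-sized "files" and one parity block computed from them.
data = [b"part one", b"part two", b"part 3!!"]
parity = xor_blocks(data)

# Pretend the middle file never arrived: XOR the survivors with the parity.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

The rebuilt block is byte-for-byte the same as the one that was lost, which is why repaired files are as good as the originals.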

Parity correction files (often referred to as Parchive files)
.par files (and .pxx)
This type of PAR set, based on the PAR1 specification, consists of one file with the extension ".par", a small index file used to check the files and define the set in much the same way that an SFV file is used, and a number of files with the extensions .p01, .p02, .p03, etc. (I'll call these collectively pxx files), which are the actual parity data files. Each of the pxx files is just a bit larger than the largest file in the set of files to be repaired. To use these files, you will need one pxx file for each file that you need to replace or repair. Open the par file with any of several utilities - FSRaid, SmartPAR, or QuickPar. SourceForge (http://parchive.sourceforge.net) has both technical information and links to download all of these clients. My preference in PAR1 software is Fluid Studios' FSRaid.


Set of downloaded RAR files with PAR files for repair. Note that "testfile.part04.rar" is missing!

Once the files are checked, the software will allow you to perform the repairs. The repaired files are completely compatible with the originals - there is no loss of data or quality from repairing with this method. The method I usually use is to download just the ".par" file and the files I want, then use FSRaid to test the files. If any repairs are necessary, it will tell you how many pxx files you will need - go back to the newsgroup and download just the number you need, if any, and you don't waste download megabytes getting unnecessary files.


Once you've loaded the .par file in FSRaid, if there are enough pxx files available, it will automatically perform the repair. Excellent!

.par2 files
The PAR2 specification has several advantages. Instead of requiring a complete replacement file to repair an incomplete source file, PAR2 divides the entire post into a set of smaller blocks. So less data is required to effect repairs - although for files that are missing entirely you will still need about as many megabytes of PAR2 files as the size of the missing files. If your newsreader allows you to download and combine incomplete files for which not all the posts are available, downloading as much of them as you can, incomplete if necessary, will allow you to repair the posts with a relatively small number of PAR2 files compared to the amount of PAR files you would need to download to do the same thing.


The same set of RAR files, but with PAR2 files for repair this time. In this example, "testfile.part04.rar" was downloaded incompletely - note the file size.

To use PAR2 files, you will need software that can work with them, such as QuickPar (see above). When it checks the files you've downloaded, it checks each file block by block rather than as a whole, and even partial files can be used along with the blocks from the PAR2 files to create complete files. Whereas with "old" PAR files an incomplete or corrupt file might as well be missing completely, with PAR2 files you can use partial or partially corrupted files in the repair. The only disadvantage is that it's slower, and the PAR2 files take longer to create, but the reduction in the amount you need to download to repair damaged files makes up for it.
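A little arithmetic shows how big the difference can be. The Python sketch below compares the two schemes for one damaged file. All of the numbers are hypothetical (15,000,000-byte RAR parts, 600,000 bytes lost from one part, a 384,000-byte PAR2 block size), and the PAR2 figure is a best-case lower bound that ignores how the damage lines up with block boundaries and any overhead inside the PAR2 files themselves.

```python
import math


def par1_download(damaged_files, largest_file_bytes):
    """PAR1: one pxx file per damaged file, each about the size of the largest file."""
    return damaged_files * largest_file_bytes


def par2_download(damaged_bytes_per_file, block_bytes):
    """PAR2: one recovery block per damaged block (best case, damage is contiguous)."""
    blocks = sum(math.ceil(n / block_bytes) for n in damaged_bytes_per_file)
    return blocks, blocks * block_bytes


# Hypothetical set: one 15,000,000-byte part is short by 600,000 bytes.
par1_bytes = par1_download(damaged_files=1, largest_file_bytes=15_000_000)
par2_blocks, par2_bytes = par2_download([600_000], block_bytes=384_000)

print(f"PAR1: about {par1_bytes:,} bytes of parity to download")
print(f"PAR2: about {par2_blocks} blocks, {par2_bytes:,} bytes of parity to download")
```

With these made-up numbers the PAR2 repair needs roughly a twentieth of the download that the PAR1 repair does, which is the trade-off described above.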


After loading the PAR2 file with QuickPar, the testing shows that the file was incomplete, and that we have enough blocks to recover it.


After the repair is done, simply exit and open the RAR archive as usual. Note that it kept a copy of the damaged / incomplete file, with a ".1" extension tacked on to the end. Usually these can simply be deleted, but the copy may be useful in some rare cases. Once in a while QuickPar will repair the file, but it actually shouldn't have been repaired - then WinRAR says that the repaired file is corrupt! In that case, just delete the repaired file and rename the ".1" copy back to ".rar", and run WinRAR to see if that fixes it. This has only happened to me once or twice in the past year or two since PAR2 became widely used.

OK, you've downloaded the files, and unRARed them, now what? Sometimes after completing a download you'll wind up with files that you were not expecting - and don't know what to do with. Different people post things in a number of different ways, depending on what software they commonly use and what is convenient for them. Always remember the three rules of Usenet binary newsgroups -
1) No One Owes You Anything.
2) There Ain't No Such Thing As A Free Lunch.
3) See The First Rule.
What I'm trying to get at is that if someone posts something you want, but it's in a format you are not familiar with or that's not your first choice, it's up to you to get the utilities or tools to work with their files, not the other way around. If you try to persuade the posters in a newsgroup to change the way they do things to make it more convenient for you, the likely result is that no one will pay any attention, and many will simply "killfilter", or block, any messages you post to the group as annoyances. Here are some tips and pointers that might help you work with files that are puzzling you.

CD Images
.bin and .cue and other image files
When you unpack the RAR files all you get is a great big .bin file and a little tiny .cue file! What's wrong? BIN/CUE files (they are a set, one of each to a CD) are CD Image files, the most commonly used CD Image format on Usenet. Rather than upload the contents of a CD separately, as a bunch of files, the poster has uploaded an image for burning a complete CD that exactly duplicates a CD they've made or obtained. Other formats for CD images include DAO (Duplicator), TAO (Duplicator), ISO (Nero, BlindRead, Easy CD Creator, many others), IMG (CloneCD), CCD (CloneCD), CIF (Easy CD Creator), NRG (Nero), C2D (WinOnCD), CDI (DiscJuggler), PXI (PlexTools), MDS (Alcohol 120%), MDF (Alcohol 120%), VC4 (Virtual CD), BWT (BlindWrite). There are several reasons why a poster might choose to upload in this format rather than simply uploading the files. In some cases the CD has special attributes, such as being bootable, that are preserved in this kind of upload. The CD may be a Video CD (VCD or SVCD) or a DVD that the author wants to upload with the menus and chapter setup intact rather than just uploading MPEG files. Or it may simply be a matter of convenience for the poster to manage files in this way. They are the poster, so it's up to them to decide, even if it's a little inconvenient for you. No worries though, you can deal with these files if you want with a few simple utilities.


A BIN/CUE fileset - The BIN file contains the data, and the CUE file is a description for how a CD burner program should handle it.
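The CUE file is plain text, so you can open it in Notepad and see what it says. As a rough illustration, the Python sketch below pulls out the two things a burner program cares about most: which BIN file the sheet points at, and what tracks it declares. It assumes a simple cue sheet with a single FILE line followed by TRACK lines; the function name parse_cue and the sample sheet are made up for this example.

```python
import re


def parse_cue(text):
    """Return (bin_filename, [(track_number, mode), ...]) from a simple cue sheet."""
    bin_name = None
    tracks = []
    for line in text.splitlines():
        line = line.strip()
        # e.g.  FILE "image.bin" BINARY
        m = re.match(r'FILE\s+"?(.+?)"?\s+(\w+)$', line, re.IGNORECASE)
        if m and bin_name is None:
            bin_name = m.group(1)
            continue
        # e.g.  TRACK 01 MODE2/2352
        m = re.match(r"TRACK\s+(\d+)\s+(\S+)", line, re.IGNORECASE)
        if m:
            tracks.append((int(m.group(1)), m.group(2)))
    return bin_name, tracks


sample = '''FILE "image.bin" BINARY
  TRACK 01 MODE2/2352
    INDEX 01 00:00:00
'''
bin_name, tracks = parse_cue(sample)
```

One practical use: if a burner program complains that it cannot find the image, check that the name on the FILE line matches the actual name of the BIN file you downloaded, since posters sometimes rename one and not the other.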

You have a couple of options with files of this type. One is simply to go ahead and burn the CD, the other is to "Open" the image file, and "Extract" the files from it, like copying files off a real CD.

In order to burn the image to a CD you may need just the right software for the image type. Many of the common formats, ISO for example, are supported by many CD burning programs, but some are not. Here are some guidelines: For "ISO" images you can use several programs. This is an open-standards format, like the ISO-9660 file format that's used to burn the CDs, so it's built into a lot of CD burning software. For others, you may have to use the software that created it, or open the file and extract the contents as outlined below. Easy CD Creator (http://www.roxio.com) can work with ISO and CIF files. Nero Burning ROM (http://www.nero.com) can work with several, including ISO, BIN/CUE, NRG, and others. Alcohol 120% (http://www.alcohol-software.com/) can work with ISO, MDS, MDF, and others. A neat little utility that I often use is Blindwrite (http://www.blindwrite.com), which can burn CDs from ISO, BIN/CUE, and BWT files.


Select the CUE file with Blindwrite - it will load the BIN file automatically


Insert a disc, and Blindwrite will make you an exact copy of the original CD

If you simply want to open the CD image file and look at the files or copy them to your hard disk, you can use a utility such as Isobuster (http://www.smart-projects.net/isobuster). It allows you to open many different image file formats including all those listed above. Simply open the ISO file and you will be presented with the directory structure of the CD as if it were open in Windows Explorer - you can copy files and directories from the CD image to your hard drive.
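If you are not sure whether a file really is an ISO image, you can peek at its header. The Python sketch below reads the ISO 9660 primary volume descriptor, which sits at sector 16 of a plain .iso file, and returns the volume label and size. It assumes a plain image with 2048-byte sectors; raw BIN images typically store 2352 bytes per sector, so these offsets will not line up for them. The function name is made up for this example.

```python
import struct

SECTOR = 2048  # logical sector size of a plain .iso image


def read_iso_volume_info(f):
    """Read the ISO 9660 primary volume descriptor from an open binary file.

    Returns (volume_label, size_in_sectors).
    """
    f.seek(16 * SECTOR)  # volume descriptors start at sector 16
    pvd = f.read(SECTOR)
    if len(pvd) < SECTOR or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no ISO 9660 primary volume descriptor found")
    volume_label = pvd[40:72].decode("ascii", "replace").rstrip()
    # Volume size in sectors; the little-endian copy is at offset 80.
    (sectors,) = struct.unpack_from("<I", pvd, 80)
    return volume_label, sectors
```

Open the image with open(path, "rb") and pass the file object in. If the function raises ValueError, the file is probably one of the other image formats listed above and is better handled by a utility that understands it.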


Open a BIN file with ISOBuster - it does not need the CUE file. Right-click on selected files to get extract options. 

Video Files
.dat files and .vob files
If the CD image is a Video-CD (VCD or SVCD) or DVD, it may not be apparent how to get at the actual video files. If you just want an MPEG file that you can view on your computer or convert to a different format, and not burn to a disc, you'll need to extract the files containing the video, and you may need to "convert them a little" before they are useful. On a VCD/SVCD the actual video will be in MPEG-1 (VCD) or MPEG-2 (SVCD) format, and will be in the "MPEGAV" directory, named something like "AVSEQ01.DAT", or "AVSEQ01.MPG". There may be several files of this type, i.e. AVSEQ01.DAT, AVSEQ02.DAT, AVSEQ03.DAT, etc. - each of the files being a different segment of the disc's video.

Some may simply be intros, credits, etc., and some (usually the largest files) will be the content you are interested in. There may be other directories on the disc as well, depending on how it was authored, such as "EXT" or "SEGMENT" that may contain content - look in all of the folders for files named "*.DAT" or "*.MPG" to get all the files that may be content you want. Usually, to watch these files and see what's in them, if they are named *.MPG you can simply double-click them. If they are named *.DAT you can rename them, changing the DAT extension to the MPG extension, and then double-click and play them in Windows Media Player or whatever you have set as the default for MPEG files.

Sometimes, however, this doesn't work quite that simply. It usually will, but in the process of creating the VideoCD, some software can change the MPEG file's format slightly when writing the DAT file (especially if the DAT file was made up of more than one MPEG file joined together in the VCD creation process) - in this case you'll need to repair or re-rip the DAT file. There are several utilities that can help. One thing to try is extracting the DAT file from the VCD image with ISOBuster using the "Extract but FILTER only M2F2 mpeg frames" option. (Right-click on the filename, and choose that option instead of simply "Extract") This will convert any frames in the MPEG file that are causing errors and will simply copy the rest of the file normally. Visually you may see some blurring when playing the problem frames during playback but it's generally hardly noticeable. Another utility to try is VCDGear (http://www.vcdgear.com - freeware, but deserving a donation if you use it!) - load the DAT file and choose the "Fix MPEG Errors" Option and save it to MPEG. If the DAT file contains multiple MPG files, each will be created separately. VCDGear can also read MPG files directly out of several different CD Image formats. Very useful.
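One quick way to tell whether a simple rename is likely to work is to look at the first few bytes of the file. The Python sketch below is a rough check based on two assumptions that are not from the tools above: that a DAT file copied straight off a VCD is often raw disc sectors inside a RIFF wrapper marked "CDXA", and that a plain MPEG program stream begins with the bytes 00 00 01 BA. The function name is made up for this example.

```python
def sniff_video_file(path):
    """Guess the container of a .DAT/.MPG file from its first 12 bytes."""
    with open(path, "rb") as f:
        head = f.read(12)
    if head[:4] == b"RIFF" and head[8:12] == b"CDXA":
        # Raw VCD sectors in a RIFF wrapper: renaming alone may not be enough.
        return "riff-cdxa"
    if head[:4] == b"\x00\x00\x01\xba":
        # MPEG program stream pack header: renaming to .mpg should do.
        return "mpeg-ps"
    return "unknown"
```

A result of "riff-cdxa" suggests going straight to ISOBuster's filter option or VCDGear rather than renaming and hoping.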


Convert a DAT file to mpg file(s) with VCDGear

In the case of a DVD image, you will find VOB files (VOB stands for Video OBject). These are in a slightly different format from the MPG files you might be used to, as they can contain multiple titles and streams. The way to deal with them is to use VCDGear after renaming the VOB file to MPG, selecting "mpg --> mpg" and "Fix MPEG Errors". It will create one or more mpg files based on the contents of the VOB file. If the VOB file is encrypted (i.e. from a commercial DVD), these methods won't work, and it is currently illegal to distribute tools for removing that copy protection.

Extra files you'll often see
.nfo and .diz files
These files used to have a set format and were used (in ancient times) by BBSes to take uploaded files and put them into the proper category (Games, Utilities, etc.) and add their descriptions to the BBS download menus. In the last several years though, they are mostly used the same way any .txt files are used, although vestiges of a common format are still apparent, and there are several tools out there to generate the files from a form in a consistent way. They can't be assumed to be "structured data" anymore though. Simply open them with Notepad or any text editor / viewer.

These are text files that describe the files in the download. Often they contain information about how long the poster intends to take to complete the upload, their preferences for repost or fill requests, and other information. It is always a good idea to read the .nfo file before downloading a large number of files, to make sure that it's something you want, first of all, and so that you will know how the poster intends to go about uploading the files.

Starting with Windows XP, the ".nfo" file extension is automatically associated in Windows with the "System Information" application, which will simply give you an error that they are invalid files. The chance that you will ever actually need to save and open "System Information" files is virtually nonexistent - it is perfectly acceptable to right-click on one of these files, choose "Open with...", and select Notepad as the application to use. If you select "Always use this application to open files of this type", it will re-assign the application association from "System Information" to "Notepad", which will make your life easier. In the unlikely event that you actually do need to open a System Information file, you can simply right-click on it and choose "Open with..." and System Information, so you are not losing anything important by doing this.

A lot of the NFO, DIZ, TXT, etc. files included with uploads have a ton of odd little characters all over them and may be hard to read. This is "Art". In order to see it properly in Notepad, change the font to a fixed-width (non-proportional) font like Courier. Some are quite pretty. This will also make it much easier to read the text buried in little balloons in the "Art".
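If the art still looks like accented letters and stray symbols even in a fixed-width font, the cause is often the character set rather than the font. Many of these files appear to be written in the old DOS code page 437, where the high characters are line-drawing and block shapes. That is an assumption about the file, not something the file declares, so treat this Python sketch as something to try. The function name read_nfo is made up for this example.

```python
def read_nfo(path):
    """Decode an .nfo or .diz file as DOS code page 437 (an assumed encoding)."""
    with open(path, "rb") as f:
        return f.read().decode("cp437")
```

Printing the result of read_nfo("release.nfo") in a terminal with a fixed-width font should show the boxes and shading as the poster intended; the filename here is only a placeholder.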

.nzb files
This file format is a descriptor of the entire posted set of files that you can load into many newsreaders (such as Newsbin Pro) to select the posts for download without having to find and select all the parts. It's a nice convenience, but you should be careful to deselect the extra Pxx or PAR2 files if you want to save download megabytes by not downloading them unless needed.
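An NZB file is just XML, so you can look inside one before handing it to your newsreader. The Python sketch below lists each posted file with its total size and flags the ones that look like parity files, which is handy for deciding what to deselect. It assumes the commonly seen NZB layout (file elements with a subject attribute, each holding segment elements with a bytes attribute) and the namespace shown in the code; the function name and the subject-matching rule are made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# Namespace assumed from commonly seen NZB files.
NS = {"nzb": "http://www.newzbin.com/DTD/2003/nzb"}

# Matches ".par", ".par2", or ".p01"-style names, but not ".part01.rar".
PARITY_NAME = re.compile(r"\.(par2|par|p\d\d)\b", re.IGNORECASE)


def summarize_nzb(xml_bytes):
    """Return [(subject, total_bytes, looks_like_parity), ...] for an NZB file."""
    root = ET.fromstring(xml_bytes)
    summary = []
    for file_el in root.findall("nzb:file", NS):
        subject = file_el.get("subject", "")
        size = sum(int(seg.get("bytes", 0))
                   for seg in file_el.findall("nzb:segments/nzb:segment", NS))
        summary.append((subject, size, bool(PARITY_NAME.search(subject))))
    return summary
```

Read the file with open(path, "rb").read() and pass the bytes in. Adding up the sizes of the entries not flagged as parity gives a rough idea of the minimum download before any repairs.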

I hope this has been helpful, or at least interesting!
--technogeek