From: Peter Avalos Date: Sat, 29 Jan 2011 00:31:58 +0000 (-1000) Subject: Import file-5.05. X-Git-Tag: v2.11.0~270^2~17^2 X-Git-Url: https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff_plain/e4d4ce0ce1d34ec601fa04afd1515865c3406361 Import file-5.05. --- diff --git a/contrib/file/ChangeLog b/contrib/file/ChangeLog index fff2f579c6..0399d91166 100644 --- a/contrib/file/ChangeLog +++ b/contrib/file/ChangeLog @@ -1,3 +1,101 @@ +2011-01-16 19:31 Reuben Thomas + + * Fix two potential buffer overruns in apprentice_list. + +2011-01-14 22:33 Reuben Thomas + + * New Python binding in pure Python. + * Update libmagic(3). + +2011-01-06 21:40 Reuben Thomas + + * Fix Python bindings (including recent Python 3 compatibility + update). + +2011-01-04 18:43 Reuben Thomas + + * magic/Makefile.am: make it easier to recover from magic build failures. + * Fix pstring length specifier parsing to avoid generating invalid + magic files. + * Add pstring length "J" (for "JPEG") to specify that the length + include itself. + * Fix JPEG comment parsing at last using pstring/HJ! + * Ignore section 5 man pages in doc/.cvsignore. + +2010-12-22 13:12 Christos Zoulas + + * Add pstring/BHhLl to specify the type of the length of pascal + strings. + +2010-11-26 18:39 Reuben Thomas + + * Fix "-e soft": it was ignored when softmagic was called + during asciimagic. + * Improve comments and use "unsigned char" in tar.h/is_tar.c. + +2010-11-05 17:26 Reuben Thomas + + * Make bug reporting addresses more visible. + +2010-11-01 18:35 Reuben Thomas + + * Add tcl magic from Gustaf Neumann + +2010-10-24 10:42 Christos Zoulas + + * Fix the whitespace comparing code (Christopher Chittleborough) + +2010-10-06 21:05 Christos Zoulas + + * allow string/t to work (Jan Kaluza) + +2010-09-20 22:11 Reuben Thomas + + * Apply some patches from Ubuntu and Fedora. + +2010-09-20 21:16 Reuben Thomas + + * Apply all patches from Debian package 5.04-6 which have not + already been applied and are not Debian-specific. 
+ +2010-09-20 15:24 Reuben Thomas + + * Minor security fix to softmagic.c (don't use untrusted + string as printf format). + +2010-07-21 12:20 Christos Zoulas + + * MINGW32 portability from LRN + + * Don't warn about escaping magic regex chars when we are in a regex. + +2010-07-19 10:55 Christos Zoulas + + * Only try to print prpsinfo for core files. (Jan Kaluza) + +2010-04-22 12:55 Christos Zoulas + + * Try more elf offsets for Debian core files. (Arnaud Giersch) + +2010-02-20 15:18 Reuben Thomas + + * Clarify which sort of CDF we mean. + +2010-02-14 22:58 Reuben Thomas + + * Re-jig Zip file type magic so that unsupported special + Zip types (those with "mimetype" at offset 30) can be + recognized. + +2010-02-02 21:50 Reuben Thomas + + * Add support for OCF (EPUB) files (application/epub+zip) + +2010-01-28 18:25 Christos Zoulas + + * Fix core-dump from unbound loop: + https://bugzilla.redhat.com/show_bug.cgi?id=533245 + 2010-01-22 15:45 Christos Zoulas * print proper mime for crystal reports file diff --git a/contrib/file/README b/contrib/file/README index 901989a0e2..62343cc2f8 100644 --- a/contrib/file/README +++ b/contrib/file/README @@ -1,15 +1,22 @@ ** README for file(1) Command ** -@(#) $File: README,v 1.42 2009/02/14 15:16:24 christos Exp $ +@(#) $File: README,v 1.43 2010/11/05 17:25:55 rrt Exp $ -E-mail: christos@astron.com Mailing List: file@mx.gw.com +Bug tracker: http://bugs.gw.com/ +E-mail: christos@astron.com Phone: Do not even think of telephoning me about this program. Send cash first! This is Release 5.x of Ian Darwin's (copyright but distributable) -file(1) command. This version is the standard "file" command for Linux, +file(1) command, an implementation of the Unix File(1) command. +It knows the 'magic number' of several thousands of file types. +This version is the standard "file" command for Linux, *BSD, and other systems. (See "patchlevel.h" for the exact release number). 
+You can download the latest version of file from: + + ftp://ftp.astron.com/pub/file/ + The major changes for 5.x are CDF file parsing, indirect magic, and overhaul in mime and ascii encoding handling. @@ -102,33 +109,6 @@ guidelines: ------------------------------------------------------------------------------ -You can download the latest version of file from: - - ftp://ftp.astron.com/pub/file/ - -If your gzip sometimes fails to decompress things complaining about a short -file, apply this patch [which is going to be in the next version of gzip]: -*** - Tue Oct 29 02:06:35 1996 ---- util.c Sun Jul 21 21:51:38 1996 -*** 106,111 **** ---- 108,114 ---- - - if (insize == 0) { - if (eof_ok) return EOF; -+ flush_window(); - read_error(); - } - bytes_in += (ulg)insize; - Parts of this software were developed at SoftQuad Inc., developers of SGML/HTML/XML publishing software, in Toronto, Canada. -SoftQuad was swallowed up by Corel in 2002 -and does not exist any longer. - -From: Kees Zeelenberg - -An MS-Windows (Win32) port of File-4.17 is available from -http://gnuwin32.sourceforge.net/ - -File is an implementation of the Unix File(1) command. -It knows the 'magic number' of several thousands of file types. +SoftQuad was swallowed up by Corel in 2002 and does not exist any longer. diff --git a/contrib/file/README.DELETED b/contrib/file/README.DELETED index f7b28c4978..a564f72351 100644 --- a/contrib/file/README.DELETED +++ b/contrib/file/README.DELETED @@ -13,14 +13,15 @@ config.sub configure configure.ac depcomp -install-sh -ltmain.sh -missing doc/Makefile.am doc/Makefile.in +install-sh +ltmain.sh +m4/ magic/Makefile.am magic/Makefile.in -python +missing +python/ src/Makefile.am src/Makefile.in src/asprintf.c diff --git a/contrib/file/README.DRAGONFLY b/contrib/file/README.DRAGONFLY index 9d01cdedf3..c85cdfa498 100644 --- a/contrib/file/README.DRAGONFLY +++ b/contrib/file/README.DRAGONFLY @@ -2,9 +2,6 @@ This directory contains most of the file distribution. 
The original source can be obtained from: ftp://ftp.astron.com/pub/file/ - MD5 (file-5.04.tar.gz) = accade81ff1cc774904b47c72c8aeea0 - SHA1 (file-5.04.tar.gz) = 56ddf7135471aa656334ed8fefe1112bcccc2cc3 - A list of the omitted files and directories can be found in README.DELETED. diff --git a/contrib/file/doc/file.man b/contrib/file/doc/file.man index 444842ac46..2b01f74404 100644 --- a/contrib/file/doc/file.man +++ b/contrib/file/doc/file.man @@ -1,5 +1,5 @@ -.\" $File: file.man,v 1.82 2009/11/04 22:30:34 christos Exp $ -.Dd October 9, 2008 +.\" $File: file.man,v 1.87 2010/11/05 20:51:38 christos Exp $ +.Dd July 23, 2010 .Dt FILE __CSECTION__ .Os .Sh NAME @@ -8,7 +8,7 @@ .Sh SYNOPSIS .Nm .Bk -words -.Op Fl bchikLNnprsvz0 +.Op Fl bchiklLNnprsvz0 .Op Fl -apple .Op Fl -mime-encoding .Op Fl -mime-type @@ -104,7 +104,8 @@ magic file .Pa __MAGIC__.mgc , or the files in the directory .Pa __MAGIC__ -if the compiled file does not exist. In addition, if +if the compiled file does not exist. +In addition, if .Pa $HOME/.magic.mgc or .Pa $HOME/.magic @@ -177,13 +178,13 @@ flag to debug a new magic file before installing it. .It Fl e , -exclude Ar testname Exclude the test named in .Ar testname -from the list of tests made to determine the file type. Valid test names -are: +from the list of tests made to determine the file type. +Valid test names are: .Bl -tag -width compress .It apptype .Dv EMX application type (only on EMX). -.It text +.It ascii Various types of text files (this test will try to guess the text encoding, irrespective of the setting of the .Sq encoding option). @@ -204,7 +205,8 @@ Examines tar files. .El .It Fl F , -separator Ar separator Use the specified string as the separator between the filename and the -file result returned. Defaults to +file result returned. +Defaults to .Sq \&: . .It Fl f , -files-from Ar namefile Read the names of the files to be examined from @@ -219,13 +221,14 @@ to test the standard input, use as a filename argument. 
.It Fl h , -no-dereference option causes symlinks not to be followed -(on systems that support symbolic links). This is the default if the -environment variable +(on systems that support symbolic links). +This is the default if the environment variable .Dv POSIXLY_CORRECT is not defined. .It Fl i , -mime Causes the file command to output mime type strings rather than the more -traditional human readable ones. Thus it may say +traditional human readable ones. +Thus it may say .Sq text/plain; charset=us-ascii rather than .Sq ASCII text . @@ -240,13 +243,16 @@ Like .Fl i , but print only the specified element(s). .It Fl k , -keep-going -Don't stop at the first match, keep going. Subsequent matches will be +Don't stop at the first match, keep going. +Subsequent matches will be have the string .Sq "\[rs]012\- " prepended. (If you want a newline, see the .Sq "\-r" option.) +.It Fl l , -list +Print information about the strength of each magic pattern. .It Fl L , -dereference option causes symlinks to be followed, as the like-named option in .Xr ls 1 @@ -254,6 +260,8 @@ option causes symlinks to be followed, as the like-named option in This is the default if the environment variable .Dv POSIXLY_CORRECT is defined. +.It Fl l +Shows sorted patterns list in the order which is used for the matching. .It Fl m , -magic-file Ar magicfiles Specify an alternate list of files and directories containing magic. This can be a single item, or a colon-separated list. @@ -304,9 +312,11 @@ Try to look inside compressed files. .It Fl 0 , -print0 Output a null character .Sq \e0 -after the end of the filename. Nice to +after the end of the filename. +Nice to .Xr cut 1 -the output. This does not affect the separator which is still printed. +the output. +This does not affect the separator which is still printed. .It Fl -help Print a help message and exit. .El @@ -329,14 +339,20 @@ will not attempt to open adds .Sq .mgc to the value of this variable as appropriate. 
+However, +.Pa file +has to exist in order for +.Pa file.mime +to be considered. The environment variable .Dv POSIXLY_CORRECT controls (on systems that support symbolic links), whether .Nm -will attempt to follow symlinks or not. If set, then +will attempt to follow symlinks or not. +If set, then .Nm -follows symlink, otherwise it does not. This is also controlled -by the +follows symlink, otherwise it does not. +This is also controlled by the .Fl L and .Fl h @@ -345,7 +361,7 @@ options. .Xr magic __FSECTION__ , .Xr strings 1 , .Xr od 1 , -.Xr hexdump 1, +.Xr hexdump 1 , .Xr file 1posix .Sh STANDARDS CONFORMANCE This program is believed to exceed the System V Interface Definition @@ -478,9 +494,10 @@ Altered by Eric Fischer (enf@pobox.com), July, 2000, to identify character codes and attempt to identify the languages of non-ASCII files. .Pp -Altered by Reuben Thomas (rrt@sc3d.org), 2007 to 2008, to improve MIME +Altered by Reuben Thomas (rrt@sc3d.org), 2007-2009, to improve MIME support and merge MIME and non-MIME magic, support directories as well -as files of magic, apply many bug fixes and improve the build system. +as files of magic, apply many bug fixes, update and fix a lot of magic, +and improve the build system. .Pp The list of contributors to the .Sq magic @@ -491,7 +508,7 @@ Many contributors are listed in the source files. .Sh LEGAL NOTICE Copyright (c) Ian F. Darwin, Toronto, Canada, 1986-1999. Covered by the standard Berkeley Software Distribution copyright; see the file -LEGAL.NOTICE in the source distribution. +COPYING in the source distribution. .Pp The files .Dv tar.h @@ -502,9 +519,10 @@ were written by John Gilmore from his public-domain program, and are not covered by the above license. .Sh BUGS .Pp -There must be a better way to automate the construction of the Magic -file from all the glop in Magdir. -What is it? +Please report bugs and send patches to the bug tracker at +.Pa http://bugs.gw.com/ +or the mailing list at +.Aq file@mx.gw.com . 
.Pp .Nm uses several algorithms that favor speed over accuracy, diff --git a/contrib/file/doc/libmagic.man b/contrib/file/doc/libmagic.man index a9e5921577..32daf33aae 100644 --- a/contrib/file/doc/libmagic.man +++ b/contrib/file/doc/libmagic.man @@ -1,4 +1,4 @@ -.\" $File: libmagic.man,v 1.21 2009/11/24 21:16:14 christos Exp $ +.\" $File: libmagic.man,v 1.23 2011/01/14 21:59:17 rrt Exp $ .\" .\" Copyright (c) Christos Zoulas 2003. .\" All Rights Reserved. @@ -25,14 +25,14 @@ .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" -.Dd November 24, 2009 +.Dd January 14, 2011 .Dt LIBMAGIC 3 .Os .Sh NAME .Nm magic_open , .Nm magic_close , .Nm magic_error , -.Nm magic_file , +.Nm magic_descriptor , .Nm magic_buffer , .Nm magic_setflags , .Nm magic_check , @@ -52,6 +52,8 @@ .Ft int .Fn magic_errno "magic_t cookie" .Ft const char * +.Fn magic_descriptor "magic_t cookie, "int fd" +.Ft const char * .Fn magic_file "magic_t cookie, const char *filename" .Ft const char * .Fn magic_buffer "magic_t cookie, const void *buffer, size_t length" @@ -92,6 +94,8 @@ and try to look in its contents. Return a MIME type string, instead of a textual description. .It Dv MAGIC_MIME_ENCODING Return a MIME encoding, instead of a textual description. +.It Dv MAGIC_MIME +A shorthand for MAGIC_MIME_TYPE | MAGIC_MIME_ENCODING. .It Dv MAGIC_CONTINUE Return all matches, not just the first. .It Dv MAGIC_CHECK @@ -101,32 +105,34 @@ On systems that support .Xr utime 2 or .Xr utimes 2 , -attempt to preserve the access time of files analyzed. +attempt to preserve the access time of files analysed. .It Dv MAGIC_RAW Don't translate unprintable characters to a \eooo octal representation. .It Dv MAGIC_ERROR Treat operating system errors while trying to open files and follow symlinks as real errors, instead of printing them in the magic buffer. +.It Dv MAGIC_APPLE +Return the Apple creator and type. 
.It Dv MAGIC_NO_CHECK_APPTYPE -Check for +Don't check for .Dv EMX application type (only on EMX). -.It Dv MAGIC_NO_CHECK_ASCII -Check for various types of ascii files. +.It Dv MAGIC_NO_CHECK_CDF +Don't get extra information on MS Composite Document Files. .It Dv MAGIC_NO_CHECK_COMPRESS -Don't look for, or inside compressed files. +Don't look inside compressed files. .It Dv MAGIC_NO_CHECK_ELF -Don't print elf details. -.It Dv MAGIC_NO_CHECK_FORTRAN -Don't look for fortran sequences inside ascii files. +Don't print ELF details. +.It Dv NO_CHECK_ENCODING +Don't check text encodings. .It Dv MAGIC_NO_CHECK_SOFT Don't consult magic files. .It Dv MAGIC_NO_CHECK_TAR Don't examine tar files. +.It Dv MAGIC_NO_CHECK_TEXT +Don't check for various types of text files. .It Dv MAGIC_NO_CHECK_TOKENS Don't look for known tokens inside ascii files. -.It Dv MAGIC_NO_CHECK_TROFF -Don't look for troff sequences inside ascii files. .El .Pp The @@ -156,6 +162,12 @@ If the is NULL, then stdin is used. .Pp The +.Fn magic_descriptor +function returns a textual description of the contents of the +.Ar fd +argument, or NULL if an error occurred. +.Pp +The .Fn magic_buffer function returns a textual description of the contents of the .Ar buffer @@ -246,4 +258,5 @@ The compiled default magic database. .Sh AUTHORS Måns Rullgård Initial libmagic implementation, and configuration. +.br Christos Zoulas API cleanup, error code and allocation handling. diff --git a/contrib/file/doc/magic.man b/contrib/file/doc/magic.man index f95280cf24..8d16a30175 100644 --- a/contrib/file/doc/magic.man +++ b/contrib/file/doc/magic.man @@ -1,4 +1,4 @@ -.\" $File: magic.man,v 1.60 2009/05/08 23:02:44 christos Exp $ +.\" $File: magic.man,v 1.66 2011/01/06 23:54:41 rrt Exp $ .Dd August 30, 2008 .Dt MAGIC __FSECTION__ .Os @@ -71,9 +71,31 @@ characters in the magic match both lower and upper case characters in the target, whereas upper case characters in the magic only match uppercase characters in the target. 
.It Dv pstring -A Pascal-style string where the first byte is interpreted as the an +A Pascal-style string where the first byte/short/int is interpreted as the an unsigned length. +The length defaults to byte and can be specified as a modifier. +The following modifiers are supported: +.Bl -tag -compact -width B +.It B +A byte length (default). +.It H +A 2 byte big endian length. +.It h +A 2 byte big little length. +.It L +A 4 byte big endian length. +.It l +A 4 byte big little length. +.It J +The length includes itself in its count. +.El The string is not NUL terminated. +.Dq J +is used rather than the more +valuable +.Dq I +because this type of length is a feature of the JPEG +format. .It Dv date A four-byte value interpreted as a UNIX date. .It Dv qdate @@ -283,7 +305,7 @@ then print the string), with The special test .Em x always evaluates to true. -.Dv message +.It Dv message The message to be printed if the comparison succeeds. If the string contains a .Xr printf 3 @@ -344,11 +366,11 @@ on the line indicates the level of the test; a line with no .Em \*[Gt] at the beginning is considered to be at level 0. Tests are arranged in a tree-like hierarchy: -If a the test on a line at level +if the test on a line at level .Em n succeeds, all following tests at level .Em n+1 -are performed, and the messages printed if the tests succeed, untile a line +are performed, and the messages printed if the tests succeed, until a line with level .Em n (or less) appears. @@ -365,7 +387,7 @@ being examined. If the first character following the last .Em \*[Gt] is a -.Em ( +.Em \&( then the string after the parenthesis is interpreted as an indirect offset. That means that the number after the parenthesis is used as an offset in the file. diff --git a/contrib/file/magic/Header b/contrib/file/magic/Header index 3ca9b0eb2c..831122e27a 100644 --- a/contrib/file/magic/Header +++ b/contrib/file/magic/Header @@ -1,5 +1,5 @@ -# Magic # Magic data for file(1) command. 
-# Machine-generated from src/cmd/file/magdir/*; edit there only! # Format is described in magic(files), where: -# files is 5 on V7 and BSD, 4 on SV, and ?? in the SVID. +# files is 5 on V7 and BSD, 4 on SV, and ?? on SVID. +# Don't edit this file, edit /etc/magic or send your magic improvements +# to the maintainers, at file@mx.gw.com diff --git a/contrib/file/magic/Magdir/adventure b/contrib/file/magic/Magdir/adventure index 5087ce66c4..febc2077f1 100644 --- a/contrib/file/magic/Magdir/adventure +++ b/contrib/file/magic/Magdir/adventure @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: adventure,v 1.10 2009/09/19 16:28:07 christos Exp $ +# $File: adventure,v 1.13 2010/12/31 16:32:54 christos Exp $ # adventure: file(1) magic for Adventure game files # # from Allen Garvin @@ -17,18 +17,26 @@ # Infocom (see z-machine) #------------------------------------------------------------------------------ # Z-machine: file(1) magic for Z-machine binaries. +# Updated by Adam Buchbinder # -# This will match ${TEX_BASE}/texmf/omega/ocp/char2uni/inbig5.ocp which -# appears to be a version-0 Z-machine binary. +#http://www.gnelson.demon.co.uk/zspec/sect11.html +#http://www.jczorkmid.net/~jpenney/ZSpec11-latest.txt +#http://en.wikipedia.org/wiki/Z-machine +# The first byte is the Z-machine revision; it is always between 1 and 8. We +# had false matches (for instance, inbig5.ocp from the Omega TeX extension as +# well as an occasional MP3 file), so we sanity-check the version number. # -# The (false match) message is to correct that behavior. Perhaps it is -# not needed. +# It might be possible to sanity-check the release number as well, as it seems +# (at least in classic Infocom games) to always be a relatively small number, +# always under 150 or so, but as this isn't rigorous, we'll wait on that until +# it becomes clear that it's needed. 
# -16 belong&0xfe00f0f0 0x3030 Infocom game data ->0 ubyte 0 (false match) ->0 ubyte >0 (Z-machine %d, ->>2 ubeshort x Release %d / ->>18 string >\0 Serial %.6s) +0 ubyte >0 +>0 ubyte <9 +>>16 belong&0xfe00f0f0 0x3030 Infocom game data +>>>0 ubyte x (Z-machine %d, +>>>>2 ubeshort x Release %d / +>>>>18 string >\0 Serial %.6s) #------------------------------------------------------------------------------ # Glulx: file(1) magic for Glulx binaries. @@ -46,10 +54,9 @@ # For Quetzal and blorb magic see iff -# TADS (Text Adventure Development System) +# TADS (Text Adventure Development System) version 2 # All files are machine-independent (games compile to byte-code) and are tagged -# with a version string of the form "V2..\0" (but TADS 3 is -# on the way). +# with a version string of the form "V2..\0". # Game files start with "TADS2 bin\n\r\032\0" then the compiler version. 0 string TADS2\ bin TADS >9 belong !0x0A0D1A00 game data, CORRUPTED @@ -74,6 +81,19 @@ >10 belong 0x0A0D1A00 >>14 string >\0 %s saved game data +# TADS (Text Adventure Development System) version 3 +# Game files start with "T3-image\015\012\032" +0 string T3-image\015\012\032 +>11 leshort x TADS 3 game data (format version %d) +# Saved game files start with "T3-state-v####\015\012\032" +# where #### is a format version number +0 string T3-state-v +>14 string \015\012\032 TADS 3 saved game data (format version +>>10 byte x %c +>>11 byte x \b%c +>>12 byte x \b%c +>>13 byte x \b%c) + # Danny Milosavljevic # this are adrift (adventure game standard) game files, extension .taf # depending on version magic continues with 0x93453E6139FA (V 4.0) diff --git a/contrib/file/magic/Magdir/animation b/contrib/file/magic/Magdir/animation index 5fdb3d02ba..e583696353 100644 --- a/contrib/file/magic/Magdir/animation +++ b/contrib/file/magic/Magdir/animation @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: animation,v 1.39 2009/09/27 19:02:12 christos Exp $ +# 
$File: animation,v 1.44 2010/11/25 15:00:12 christos Exp $ # animation: file(1) magic for animation/movie formats # # animation formats @@ -44,8 +44,16 @@ >8 string mp7b \b, MPEG v4 system, MPEG v7 binary XML >8 string/W jp2 \b, JPEG 2000 !:mime image/jp2 +>8 string 3ge \b, MPEG v4 system, 3GPP +!:mime video/3gpp +>8 string 3gg \b, MPEG v4 system, 3GPP +!:mime video/3gpp >8 string 3gp \b, MPEG v4 system, 3GPP !:mime video/3gpp +>8 string 3gs \b, MPEG v4 system, 3GPP +!:mime video/3gpp +>8 string 3g2 \b, MPEG v4 system, 3GPP2 +!:mime video/3gpp2 >>11 byte 4 \b v4 (H.263/AMR GSM 6.10) >>11 byte 5 \b v5 (H.263/AMR GSM 6.10) >>11 byte 6 \b v6 (ITU H.264/AMR GSM 6.10) @@ -152,6 +160,7 @@ >>4 byte 252 \b, FGS @ L4 >>4 byte 253 \b, FGS @ L5 >3 byte 0xB5 MPEG sequence, v4 +!:mime video/mpeg4-generic >>4 byte &0x80 >>>5 byte&0xF0 16 \b, video (missing profile header) >>>5 byte&0xF0 32 \b, still texture (missing profile header) @@ -162,6 +171,7 @@ >>4 byte&0xF8 24 \b, mesh (missing profile header) >>4 byte&0xF8 32 \b, face (missing profile header) >3 byte 0xB3 MPEG sequence +!:mime video/mpeg >>12 belong 0x000001B8 \b, v1, progressive Y'CbCr 4:2:0 video >>12 belong 0x000001B2 \b, v1, progressive Y'CbCr 4:2:0 video >>12 belong 0x000001B5 \b, v2, @@ -731,7 +741,7 @@ # X3D (Extensible 3D) [http://www.web3d.org/specifications/x3d-3.0.dtd] # From Michel Briand -0 string \20 search/1000/cw \ 2008-07-18 0 string BIK Bink Video >3 regex =[a-z] rev.%s @@ -819,3 +830,8 @@ >>51 byte&0x20 !0 stereo #>>51 byte&0x10 0 FFT #>>51 byte&0x10 !0 DCT + +# Type: NUT Container +# URL: http://wiki.multimedia.cx/index.php?title=NUT +# From: Adam Buchbinder +0 string nut/multimedia\ container\0 NUT multimedia container diff --git a/contrib/file/magic/Magdir/apple b/contrib/file/magic/Magdir/apple index 0d04c518cd..dad3eee925 100644 --- a/contrib/file/magic/Magdir/apple +++ b/contrib/file/magic/Magdir/apple @@ -1,9 +1,9 @@ 
#------------------------------------------------------------------------------ -# $File: apple,v 1.23 2009/09/19 16:28:08 christos Exp $ +# $File: apple,v 1.24 2010/11/25 15:00:12 christos Exp $ # apple: file(1) magic for Apple file formats # -0 search/1 FiLeStArTfIlEsTaRt binscii (apple ][) text +0 search/1/t FiLeStArTfIlEsTaRt binscii (apple ][) text 0 string \x0aGL Binary II (apple ][) data 0 string \x76\xff Squeezed (apple ][) data 0 string NuFile NuFile archive (apple ][) data diff --git a/contrib/file/magic/Magdir/archive b/contrib/file/magic/Magdir/archive index 07c509c1a0..c998e5b899 100644 --- a/contrib/file/magic/Magdir/archive +++ b/contrib/file/magic/Magdir/archive @@ -1,6 +1,5 @@ - #------------------------------------------------------------------------------ -# $File: archive,v 1.55 2009/12/04 15:00:47 christos Exp $ +# $File: archive,v 1.62 2011/01/07 20:24:25 christos Exp $ # archive: file(1) magic for archive formats (see also "msdos" for self- # extracting compressed archives) # @@ -245,13 +244,13 @@ # MS Compress 4 string \x88\xf0\x27 MS Compress archive data # updated by Joerg Jenderek ->9 string \0 ->>0 string KWAJ +>9 string \0 +>>0 string KWAJ >>>7 string \321\003 MS Compress archive data >>>>14 ulong >0 \b, original size: %ld bytes ->>>>18 ubyte >0x65 ->>>>>18 string x \b, was %.8s ->>>>>(10.b-4) string x \b.%.3s +>>>>18 ubyte >0x65 +>>>>>18 string x \b, was %.8s +>>>>>(10.b-4) string x \b.%.3s # MP3 (archiver, not lossy audio compression) 0 string MP3\x1a MP3-Archiver archive data # ZET @@ -276,7 +275,7 @@ # Splint 0 string \x93\xb9\x06 Splint archive data # InstallShield -0 string \x13\x5d\x65\x8c InstallShield Z archive Data +0 string \x13\x5d\x65\x8c InstallShield Z archive Data # Gather 1 string GTH Gather archive data # BOA @@ -535,7 +534,7 @@ >20 byte x - header level %d # taken from idarc [JW] 2 string -lZ PUT archive data -2 string -lz LZS archive data +2 string -lz LZS archive data 2 string -sw1- Swag archive data # RAR archiver 
(Greg Roelofs, newt@uchicago.edu) @@ -566,29 +565,20 @@ 0 string PK\x07\x08PK\x03\x04 Zip multi-volume archive data, at least PKZIP v2.50 to extract !:mime application/zip -# ZIP archives (Greg Roelofs, c/o zip-bugs@wkuvx1.wku.edu) +# Zip archives (Greg Roelofs, c/o zip-bugs@wkuvx1.wku.edu) 0 string PK\003\004 ->30 ubelong !0x6d696d65 ->>4 byte 0x00 Zip archive data -!:mime application/zip ->>4 byte 0x09 Zip archive data, at least v0.9 to extract -!:mime application/zip ->>4 byte 0x0a Zip archive data, at least v1.0 to extract -!:mime application/zip ->>4 byte 0x0b Zip archive data, at least v1.1 to extract -!:mime application/zip ->>0x161 string WINZIP Zip archive data, WinZIP self-extracting -!:mime application/zip ->>4 byte 0x14 Zip archive data, at least v2.0 to extract -!:mime application/zip -# OpenOffice.org / KOffice / StarOffice documents -# Listed here because they ARE zip files -# -# From: Abel Cheung ->30 string mimetype +# Specialised zip formats which start with a member named 'mimetype' +# (stored uncompressed, with no 'extra field') containing the file's MIME type. +# Check for have 8-byte name, 0-byte extra field, name "mimetype", and +# contents starting with "application/": +>26 string \x8\0\0\0mimetypeapplication/ -# KOffice (1.2 or above) formats +# KOffice / OpenOffice & StarOffice / OpenDocument formats +# From: Abel Cheung + +# KOffice (1.2 or above) formats +# (mimetype contains "application/vnd.kde.") >>50 string vnd.kde. KOffice (>=1.2) >>>58 string karbon Karbon document >>>58 string kchart KChart document @@ -599,7 +589,8 @@ >>>58 string kspread KSpread document >>>58 string kword KWord document -# OpenOffice formats (for OpenOffice 1.x / StarOffice 6/7) +# OpenOffice formats (for OpenOffice 1.x / StarOffice 6/7) +# (mimetype contains "application/vnd.sun.xml.") >>50 string vnd.sun.xml. 
OpenOffice.org 1.x >>>62 string writer Writer >>>>68 byte !0x2e document @@ -617,8 +608,9 @@ >>>62 string math Math document >>>62 string base Database file -# OpenDocument formats (for OpenOffice 2.x / StarOffice >= 8) -# http://lists.oasis-open.org/archives/office/200505/msg00006.html +# OpenDocument formats (for OpenOffice 2.x / StarOffice >= 8) +# http://lists.oasis-open.org/archives/office/200505/msg00006.html +# (mimetype contains "application/vnd.oasis.opendocument.") >>50 string vnd.oasis.opendocument. OpenDocument >>>73 string text >>>>77 byte !0x2d Text @@ -662,6 +654,41 @@ >>>>78 string -template Template !:mime application/vnd.oasis.opendocument.image-template +# EPUB (OEBPS) books using OCF (OEBPS Container Format) +# From: Adam Buchbinder +# http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm, section 4. +# (mimetype contains "application/epub+zip") +>>50 string epub+zip EPUB ebook data +!:mime application/epub+zip + +# Catch other ZIP-with-mimetype formats +# In a ZIP file, the bytes immediately after a member's contents are +# always "PK". The 2 regex rules here print the "mimetype" member's +# contents up to the first 'P'. Luckily, most MIME types don't contain +# any capital 'P's. This is a kludge. +# (mimetype contains "application/") +>>50 string !epub+zip +>>>50 string !vnd.oasis.opendocument. +>>>>50 string !vnd.sun.xml. +>>>>>50 string !vnd.kde. +>>>>>>38 regex [!-OQ-~]+ Zip data (MIME type "%s"?) +!:mime application/zip +# (mimetype contents other than "application/*") +>26 string \x8\0\0\0mimetype +>>38 string !application/ +>>>38 regex [!-OQ-~]+ Zip data (MIME type "%s"?) 
+!:mime application/zip + +# Generic zip archives (Greg Roelofs, c/o zip-bugs@wkuvx1.wku.edu) +# Next line excludes specialized formats: +>26 string !\x8\0\0\0mimetype Zip archive data +!:mime application/zip +>>4 byte 0x09 \b, at least v0.9 to extract +>>4 byte 0x0a \b, at least v1.0 to extract +>>4 byte 0x0b \b, at least v1.1 to extract +>>0x161 string WINZIP \b, WinZIP self-extracting +>>4 byte 0x14 \b, at least v2.0 to extract + # Zoo archiver 20 lelong 0xfdc4a7dc Zoo archive data !:mime application/x-zoo @@ -679,7 +706,7 @@ !:mime application/octet-stream # -# LBR. NB: May conflict with the questionable +# LBR. NB: May conflict with the questionable # "binary Computer Graphics Metafile" format. # 0 string \0\ \ \ \ \ \ \ \ \ \ \ \0\0 LBR archive data @@ -695,10 +722,10 @@ # From Rafael Laboissiere # The Project Revision Control System (see # http://prcs.sourceforge.net) generates a packaged project -# file which is recognized by the following entry: +# file which is recognized by the following entry: 0 leshort 0xeb81 PRCS packaged project -# Microsoft cabinets +# Microsoft cabinets # by David Necas (Yeti) #0 string MSCF\0\0\0\0 Microsoft cabinet file data, #>25 byte x v%d @@ -706,7 +733,7 @@ # MPi: All CABs have version 1.3, so this is pointless. # Better magic in debian-additions. 
-# GTKtalog catalogs
+# GTKtalog catalogs
 # by David Necas (Yeti)
 4 string gtktalog\ GTKtalog catalog data,
 >13 string 3 version 3
@@ -725,12 +752,12 @@
 !:mime application/x-bittorrent
 # Atari MSA archive - Teemu Hukkanen
-0 beshort 0x0e0f Atari MSA archive data
->2 beshort x \b, %d sectors per track
->4 beshort 0 \b, 1 sided
->4 beshort 1 \b, 2 sided
->6 beshort x \b, starting track: %d
->8 beshort x \b, ending track: %d
+0 beshort 0x0e0f Atari MSA archive data
+>2 beshort x \b, %d sectors per track
+>4 beshort 0 \b, 1 sided
+>4 beshort 1 \b, 2 sided
+>6 beshort x \b, starting track: %d
+>8 beshort x \b, ending track: %d
 # Alternate ZIP string (amc@arwen.cs.berkeley.edu)
 0 string PK00PK\003\004 Zip archive data
@@ -775,7 +802,7 @@
 # DR-DOS 7.03 Packed File *.??_
 0 string Packed\ File\ Personal NetWare Packed File
->12 string x \b, was "%.12s"
+>12 string x \b, was "%.12s"
 # EET archive
 # From: Tilman Sauerbeck
@@ -830,3 +857,29 @@
 >24 belong 0 no checksum
 >24 belong 1 SHA-1 checksum
 >24 belong 2 MD5 checksum
+
+# Type: Parity Archive
+# From: Daniel van Eeden
+0 string PAR2 Parity Archive Volume Set
+
+# Bacula volume format. (Volumes always start with a block header.)
+# URL: http://bacula.org/3.0.x-manuals/en/developers/developers/Block_Header.html
+# From: Adam Buchbinder
+12 string BB02 Bacula volume
+>20 bedate x \b, started %s
+
+# ePub is XHTML + XML inside a ZIP archive. The first member of the
+# archive must be an uncompressed file called 'mimetype' with contents
+# 'application/epub+zip'
+
+# start by checking that this is a ZIP archive, then check for the
+# proper mimetype file
+# From: Ralf Brown
+0 string PK\003\004
+>0x1E string mimetypeapplication/epub+zip EPUB document
+!:mime application/epub+zip
+
+# From: "Michał Górny"
+# ZPAQ: http://mattmahoney.net/dc/zpaq.html
+0 string zPQ ZPAQ stream
+>3 byte x \b, level %d
diff --git a/contrib/file/magic/Magdir/audio b/contrib/file/magic/Magdir/audio
index e9694919bd..483a656992 100644
--- a/contrib/file/magic/Magdir/audio
+++ b/contrib/file/magic/Magdir/audio
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: audio,v 1.59 2009/11/04 17:27:37 christos Exp $
+# $File: audio,v 1.61 2010/09/20 19:19:16 rrt Exp $
 # audio: file(1) magic for sound formats (see also "iff")
 #
 # Jan Nicolai Langfeldt (janl@ifi.uio.no), Dan Quinlan (quinlan@yggdrasil.com),
@@ -118,7 +118,7 @@
 # Real Audio (Magic .ra\0375)
 0 belong 0x2e7261fd RealAudio sound file
 !:mime audio/x-pn-realaudio
-0 string .RMF RealMedia file
+0 string .RMF\0\0\0 RealMedia file
 !:mime application/vnd.rn-realmedia
 #video/x-pn-realvideo
 #video/vnd.rn-realvideo
@@ -308,6 +308,15 @@
 >122 byte&0x1 =1 PAL
 >122 byte&0x1 =0 NTSC
+# Type: SNES SPC700 sound files
+# From: Josh Triplett
+0 string SNES-SPC700\ Sound\ File\ Data\ v SNES SPC700 sound file
+>&0 string 0.30 \b, version %s
+>>0x23 byte 0x1B \b, without ID666 tag
+>>0x23 byte 0x1A \b, with ID666 tag
+>>>0x2E string >\0 \b, song "%.32s"
+>>>0x4E string >\0 \b, game "%.32s"
+
 # Impulse tracker module (audio/x-it)
 0 string IMPM Impulse Tracker module sound data -
 !:mime audio/x-mod
@@ -606,3 +615,8 @@
 # URL: http://filext.com/detaillist.php?extdetail=AMR
 # From: Russell Coker
 0 string #!AMR Adaptive Multi-Rate Codec (GSM telephony)
+
+# Type: SuperCollider 3 Synth Definition File Format
+# From: Mario Lang
+0 string SCgf SuperCollider3 Synth Definition file,
+>4 belong x version %d
diff --git a/contrib/file/magic/Magdir/blcr b/contrib/file/magic/Magdir/blcr
new file mode 100644
index 0000000000..9ccd4dc4f0
--- /dev/null
+++ b/contrib/file/magic/Magdir/blcr
@@ -0,0 +1,25 @@
+# Berkeley Lab Checkpoint Restart (BLCR) checkpoint context files
+# http://ftg.lbl.gov/checkpoint
+0 string C\0\0\0R\0\0\0 BLCR
+>16 lelong 1 x86
+>16 lelong 3 alpha
+>16 lelong 5 x86-64
+>16 lelong 7 ARM
+>8 lelong x context data (little endian, version %d)
+# Uncomment the following only of your "file" program supports "search"
+#>0 search/1024 VMA\06 for kernel
+#>>&1 byte x %d.
+#>>&2 byte x %d.
+#>>&3 byte x %d
+0 string \0\0\0C\0\0\0R BLCR
+>16 belong 2 SPARC
+>16 belong 4 ppc
+>16 belong 6 ppc64
+>16 belong 7 ARMEB
+>16 belong 8 SPARC64
+>8 belong x context data (big endian, version %d)
+# Uncomment the following only of your "file" program supports "search"
+#>0 search/1024 VMA\06 for kernel
+#>>&1 byte x %d.
+#>>&2 byte x %d.
+#>>&3 byte x %d
diff --git a/contrib/file/magic/Magdir/bsi b/contrib/file/magic/Magdir/bsi
new file mode 100644
index 0000000000..51a62891c2
--- /dev/null
+++ b/contrib/file/magic/Magdir/bsi
@@ -0,0 +1,9 @@
+# Chiasmus is a encryption standard developed by the German Federal
+# Office for Information Security (Bundesamt fuer Sicherheit in der
+# Informationstechnik).
+
+# Extension: .xia
+0 string XIA1 Chiasmus encrypted data
+
+# Extension: .xis
+0 string XIS Chiasmus key
diff --git a/contrib/file/magic/Magdir/cad b/contrib/file/magic/Magdir/cad
index f2b0eba7cb..8069b1d47e 100644
--- a/contrib/file/magic/Magdir/cad
+++ b/contrib/file/magic/Magdir/cad
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: cad,v 1.9 2009/09/19 16:28:08 christos Exp $
+# $File: cad,v 1.10 2010/12/25 14:33:43 christos Exp $
 # autocad: file(1) magic for cad files
 #
@@ -51,9 +51,54 @@
 # AutoCad, from Nahuel Greco
 # AutoCAD DWG versions R12/R13/R14 (www.autodesk.com)
-0 string AC1012 AutoCad (release 12)
-0 string AC1013 AutoCad (release 13)
-0 string AC1014 AutoCad (release 14)
+0 string AC1012 DWG AutoDesk AutoCad (release 12)
+0 string AC1013 DWG AutoDesk AutoCad (release 13)
+0 string AC1014 DWG AutoDesk AutoCad (release 14)
+# A new version of AutoCAD DWG
+# Sergey Zaykov (mail_of_sergey@mail.ru, sergey_zaikov@rambler.ru,
+# ICQ 358572321)
+# From various sources like:
+# http://autodesk.blogs.com/between_the_lines/autocad-release-history.html
+0 string AC1018 DWG AutoDesk AutoCAD 2004/2005/2006
+0 string AC1021 DWG AutoDesk AutoCAD 2007/2008/2009
+0 string AC1024 DWG AutoDesk AutoCAD 2010/2011
+
+# KOMPAS 2D drawing from ASCON
+# This is KOMPAS 2D drawing or fragment of drawing but is not detailed nor
+# gathered nor specification
+# ASCON http://ascon.net/main/ in English,
+# http://ascon.ru/ main site in Russian
+# Extension is CDW for drawing and FRW for fragment of drawing
+# Sergey Zaykov (mail_of_sergey@mail.ru, sergey_zaikov@rambler.ru,
+# ICQ 358572321, http://vkontakte.ru/id16076543)
+# From:
+# http://sd.ascon.ru/otrs/customer.pl?Action=CustomerFAQ&CategoryID=4&ItemID=292
+# (in russian) and my experiments
+0 string KF
+>2 belong 0x4E00000C Kompas drawing 12.0 SP1
+>2 belong 0x4D00000C Kompas drawing 12.0
+>2 belong 0x3200000B Kompas drawing 11.0 SP1
+>2 belong 0x3100000B Kompas drawing 11.0
+>2 belong 0x2310000A Kompas drawing 10.0 SP1
+>2 belong 0x2110000A Kompas drawing 10.0
+>2 belong 0x08000009 Kompas drawing 9.0 SP1
+>2 belong 0x05000009 Kompas drawing 9.0
+>2 belong 0x33010008 Kompas drawing 8+
+>2 belong 0x1A000008 Kompas drawing 8.0
+>2 belong 0x2C010107 Kompas drawing 7+
+>2 belong 0x05000007 Kompas drawing 7.0
+>2 belong 0x32000006 Kompas drawing 6+
+>2 belong 0x09000006 Kompas drawing 6.0
+>2 belong 0x5C009005 Kompas drawing 5.11R03
+>2 belong 0x54009005 Kompas drawing 5.11R02
+>2 belong 0x51009005 Kompas drawing 5.11R01
+>2 belong 0x22009005 Kompas drawing 5.10R03
+>2 belong 0x22009005 Kompas drawing 5.10R02 mar
+>2 belong 0x21009005 Kompas drawing 5.10R02 febr
+>2 belong 0x19009005 Kompas drawing 5.10R01
+>2 belong 0xF4008005 Kompas drawing 5.9R01.003
+>2 belong 0x1C008005 Kompas drawing 5.9R01.002
+>2 belong 0x11008005 Kompas drawing 5.8R01.003
 # CAD: file(1) magic for computer aided design files
 # Phillip Griffith
diff --git a/contrib/file/magic/Magdir/chord b/contrib/file/magic/Magdir/chord
index 134ee81f9e..00d0bec65a 100644
--- a/contrib/file/magic/Magdir/chord
+++ b/contrib/file/magic/Magdir/chord
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: chord,v 1.4 2009/09/19 16:28:08 christos Exp $
+# $File: chord,v 1.5 2010/09/20 19:19:16 rrt Exp $
 # chord: file(1) magic for Chord music sheet typesetting utility input files
 #
 # From Philippe De Muyter
@@ -8,3 +8,8 @@
 #
 0 string {title Chord text file
+# Type: PowerTab file format
+# URL: http://www.power-tab.net/
+# From: Jelmer Vernooij
+0 string ptab\003\000 Power-Tab v3 Tablature File
+0 string ptab\004\000 Power-Tab v4 Tablature File
diff --git a/contrib/file/magic/Magdir/commands b/contrib/file/magic/Magdir/commands
index 7874de7e2c..ae3756f45e 100644
--- a/contrib/file/magic/Magdir/commands
+++ b/contrib/file/magic/Magdir/commands
@@ -1,70 +1,74 @@
 #------------------------------------------------------------------------------
-# $File: commands,v 1.36 2010/01/24 18:41:11 christos Exp $
+# $File: commands,v 1.39 2010/11/25 15:00:12 christos Exp $
 # commands: file(1) magic for various shells and interpreters
 #
-#0 string : shell archive or script for antique kernel text
-0 string/w #!\ /bin/sh POSIX shell script text executable
+#0 string/w : shell archive or script for antique kernel text
+0 string/wt #!\ /bin/sh POSIX shell script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /bin/csh C shell script text executable
+0 string/wt #!\ /bin/csh C shell script text executable
 !:mime text/x-shellscript
 # korn shell magic, sent by George Wu, gwu@clyde.att.com
-0 string/w #!\ /bin/ksh Korn shell script text executable
+0 string/wt #!\ /bin/ksh Korn shell script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /bin/tcsh Tenex C shell script text executable
+0 string/wt #!\ /bin/tcsh Tenex C shell script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /usr/local/tcsh Tenex C shell script text executable
+0 string/wt #!\ /usr/bin/tcsh Tenex C shell script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /usr/local/bin/tcsh Tenex C shell script text executable
+0 string/wt #!\ /usr/local/tcsh Tenex C shell script text executable
+!:mime text/x-shellscript
+0 string/wt #!\ /usr/local/bin/tcsh Tenex C shell script text executable
 !:mime text/x-shellscript
 #
 # zsh/ash/ae/nawk/gawk magic from cameron@cs.unsw.oz.au (Cameron Simpson)
-0 string/w #!\ /bin/zsh Paul Falstad's zsh script text executable
+0 string/wt #!\ /bin/zsh Paul Falstad's zsh script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /usr/bin/zsh Paul Falstad's zsh script text executable
+0 string/wt #!\ /usr/bin/zsh Paul Falstad's zsh script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /usr/local/bin/zsh Paul Falstad's zsh script text executable
+0 string/wt #!\ /usr/local/bin/zsh Paul Falstad's zsh script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /usr/local/bin/ash Neil Brown's ash script text executable
+0 string/wt #!\ /usr/local/bin/ash Neil Brown's ash script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /usr/local/bin/ae Neil Brown's ae script text executable
+0 string/wt #!\ /usr/local/bin/ae Neil Brown's ae script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /bin/nawk new awk script text executable
+0 string/wt #!\ /bin/nawk new awk script text executable
 !:mime text/x-nawk
-0 string/w #!\ /usr/bin/nawk new awk script text executable
+0 string/wt #!\ /usr/bin/nawk new awk script text executable
 !:mime text/x-nawk
-0 string/w #!\ /usr/local/bin/nawk new awk script text executable
+0 string/wt #!\ /usr/local/bin/nawk new awk script text executable
 !:mime text/x-nawk
-0 string/w #!\ /bin/gawk GNU awk script text executable
+0 string/wt #!\ /bin/gawk GNU awk script text executable
 !:mime text/x-gawk
-0 string/w #!\ /usr/bin/gawk GNU awk script text executable
+0 string/wt #!\ /usr/bin/gawk GNU awk script text executable
 !:mime text/x-gawk
-0 string/w #!\ /usr/local/bin/gawk GNU awk script text executable
+0 string/wt #!\ /usr/local/bin/gawk GNU awk script text executable
 !:mime text/x-gawk
 #
-0 string/w #!\ /bin/awk awk script text executable
+0 string/wt #!\ /bin/awk awk script text executable
 !:mime text/x-awk
-0 string/w #!\ /usr/bin/awk awk script text executable
+0 string/wt #!\ /usr/bin/awk awk script text executable
 !:mime text/x-awk
-# update to distinguish from *.vcf files
-# this is broken because postscript has /EBEGIN{ for example.
-#0 search/Ww BEGIN { awk script text
+0 regex =^\\s*BEGIN\\s*[{] awk script text
 # AT&T Bell Labs' Plan 9 shell
-0 string/w #!\ /bin/rc Plan 9 rc shell script text executable
+0 string/wt #!\ /bin/rc Plan 9 rc shell script text executable
 # bash shell magic, from Peter Tobias (tobias@server.et-inf.fho-emden.de)
-0 string/w #!\ /bin/bash Bourne-Again shell script text executable
+0 string/wt #!\ /bin/bash Bourne-Again shell script text executable
+!:mime text/x-shellscript
+0 string/wt #!\ /usr/bin/bash Bourne-Again shell script text executable
 !:mime text/x-shellscript
-0 string/w #!\ /usr/local/bin/bash Bourne-Again shell script text executable
+0 string/wt #!\ /usr/local/bash Bourne-Again shell script text executable
+!:mime text/x-shellscript
+0 string/wt #!\ /usr/local/bin/bash Bourne-Again shell script text executable
 !:mime text/x-shellscript
 # using env
-0 string #!/usr/bin/env a
->15 string >\0 %s script text executable
-0 string #!\ /usr/bin/env a
->16 string >\0 %s script text executable
+0 string/t #!/usr/bin/env a
+>15 string/t >\0 %s script text executable
+0 string/t #!\ /usr/bin/env a
+>16 string/t >\0 %s script text executable
 # PHP scripts
 # Ulf Harnhammar
@@ -81,4 +85,9 @@
 0 string Zend\x00 PHP script Zend Optimizer data
-0 string $! DCL command file
+0 string/t $! DCL command file
+
+# Type: Pdmenu
+# URL: http://packages.debian.org/pdmenu
+# From: Edward Betts
+0 string #!/usr/bin/pdmenu Pdmenu configuration file text
diff --git a/contrib/file/magic/Magdir/compress b/contrib/file/magic/Magdir/compress
index d52948663e..7a6e7150dd 100644
--- a/contrib/file/magic/Magdir/compress
+++ b/contrib/file/magic/Magdir/compress
@@ -1,6 +1,5 @@
-
 #------------------------------------------------------------------------------
-# $File: compress,v 1.42 2009/09/19 16:28:08 christos Exp $
+# $File: compress,v 1.46 2010/09/20 19:19:17 rrt Exp $
 # compress: file(1) magic for pure-compression formats (no archives)
 #
 # compress, gzip, pack, compact, huf, squeeze, crunch, freeze, yabba, etc.
@@ -20,7 +19,7 @@
 # Edited by Chris Chittleborough , March 2002
 # * Original filename is only at offset 10 if "extra field" absent
 # * Produce shorter output - notably, only report compression methods
-# other than 8 ("deflate", the only method defined in RFC 1952).
+# other than 8 ("deflate", the only method defined in RFC 1952).
 0 string \037\213 gzip compressed data
 !:mime application/x-gzip
 >2 byte <8 \b, reserved method
@@ -183,21 +182,21 @@
 >4 belong 0x090A0C0D best compression
 # 7-zip archiver, from Thomas Klausner (wiz@danbala.tuwien.ac.at)
-# http://www.7-zip.org or DOC/7zFormat.txt
+# http://www.7-zip.org or DOC/7zFormat.txt
 #
 0 string 7z\274\257\047\034 7-zip archive data,
 >6 byte x version %d
 >7 byte x \b.%d
+!:mime application/x-7z-compressed
 # Type: LZMA
-# URL: http://www.7-zip.org/sdk.html
-# From: Robert Millan and Reuben Thomas
-# Commented out because apparently not reliable (according to Debian
-# bug #364260)
-#0 string ]\000\000\200\000 LZMA compressed data
+0 lelong 0x8000005d LZMA compressed data,
+>5 lequad =0xffffffffffffffff streamed
+>5 lequad !0xffffffffffffffff non-streamed, size %lld
+!:mime application/x-lzma
 # http://tukaani.org/xz/xz-file-format.txt
-0 ustring \xFD7zXZ\x00 xz compressed data
+0 ustring \xFD7zXZ\x00 XZ compressed data
 !:mime application/x-xz
 # AFX compressed files (Wolfram Kleff)
@@ -218,3 +217,13 @@
 # URL: http://tukaani.org/xz/
 0 string \xfd\x37\x7a\x58\x5a\x00 XZ compressed data
 !:mime application/x-xz
+
+0 string ArC\x01 FreeArc archive
+
+# Type: DACT compressed files
+0 long 0x444354C3 DACT compressed data
+>4 byte >-1 (version %i.
+>5 byte >-1 %i.
+>6 byte >-1 %i)
+>7 long >0 , original size: %i bytes
+>15 long >30 , block size: %i bytes
diff --git a/contrib/file/magic/Magdir/console b/contrib/file/magic/Magdir/console
index f12af6db76..573dd15fd5 100644
--- a/contrib/file/magic/Magdir/console
+++ b/contrib/file/magic/Magdir/console
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: console,v 1.16 2009/09/19 16:28:08 christos Exp $
+# $File: console,v 1.18 2010/09/20 19:19:17 rrt Exp $
 # Console game magic
 # Toby Deshane
 # ines: file(1) magic for Marat's iNES Nintendo Entertainment System
@@ -165,9 +165,13 @@
 # Atari Lynx cartridge dump (EXE/BLL header)
 # From: "Stefan A. Haubenthal"
-0 beshort 0x8008 Lynx cartridge,
->2 beshort x RAM start $%04x
->6 string BS93
+# Double-check that the image type matches too, 0x8008 conflicts with
+# 8 character OMF-86 object file headers.
+0 beshort 0x8008
+>6 string BS93 Lynx homebrew cartridge
+>>2 beshort x \b, RAM start $%04x
+>6 string LYNX Lynx cartridge
+>>2 beshort x \b, RAM start $%04x
 # Opera file system that is used on the 3DO console
 # From: Serge van den Boom
@@ -254,3 +258,7 @@
 >>>(0x18.l-26) lelong x CRC32 0x%08x
 >>>(0x18.l-23) string x "%s"
+# Type: scummVM savegame files
+# From: Sven Hartge
+0 string SCVM ScummVM savegame
+>12 string >\0 "%s"
diff --git a/contrib/file/magic/Magdir/database b/contrib/file/magic/Magdir/database
index c4a03f41d5..02399746cd 100644
--- a/contrib/file/magic/Magdir/database
+++ b/contrib/file/magic/Magdir/database
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: database,v 1.24 2009/09/19 16:28:08 christos Exp $
+# $File: database,v 1.26 2010/12/26 23:10:59 christos Exp $
 # database: file(1) magic for various databases
 #
 # extracted from header/code files by Graeme Wilford (eep2gw@ee.surrey.ac.uk)
@@ -268,3 +268,22 @@
 >40 lequad x \b, bnum=%lld
 >48 lequad x \b, rnum=%lld
 >56 lequad x \b, fsiz=%lld
+
+# G-IR database made by gobject-introspect toolset,
+# http://live.gnome.org/GObjectIntrospection
+0 string GOBJ\nMETADATA\r\n\032 G-IR binary database
+>16 byte x \b, v%d
+>17 byte x \b.%d
+>20 leshort x \b, %d entries
+>22 leshort x \b/%d local
+
+# Type: QDBM Quick Database Manager
+# From: Benoit Sibaud
+0 string \\[depot\\]\n\f Quick Database Manager, little endian
+0 string \\[DEPOT\\]\n\f Quick Database Manager, big endian
+
+# Type: TokyoCabinet database
+# URL: http://tokyocabinet.sourceforge.net/
+# From: Benoit Sibaud
+0 string ToKyO\ CaBiNeT\n TokyoCabinet database
+>14 string x (version %s)
diff --git a/contrib/file/magic/Magdir/diff b/contrib/file/magic/Magdir/diff
index 0992caa9c4..73f0135df8 100644
--- a/contrib/file/magic/Magdir/diff
+++ b/contrib/file/magic/Magdir/diff
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: diff,v 1.10 2009/09/19 16:28:08 christos Exp $
+# $File: diff,v 1.12 2010/12/07 16:52:52 christos Exp $
 # diff: file(1) magic for diff(1) output
 #
 0 search/1 diff\ diff output text
@@ -16,4 +16,14 @@
 !:mime text/x-diff
 # bsdiff: file(1) magic for bsdiff(1) output
-0 string BSDIFF40 bsdiff(1) patch file
+0 string/t BSDIFF40 bsdiff(1) patch file
+
+
+# unified diff
+0 search/4096 ---\ 
+>&0 search/1024 \n
+>>&0 search/1 +++\ 
+>>>&0 search/1024 \n
+>>>>&0 search/1 @@ unified diff output text
+!:mime text/x-diff
+!:strength + 90
diff --git a/contrib/file/magic/Magdir/dyadic b/contrib/file/magic/Magdir/dyadic
index aa10c2e4e8..c1a2c3c53e 100644
--- a/contrib/file/magic/Magdir/dyadic
+++ b/contrib/file/magic/Magdir/dyadic
@@ -1,9 +1,9 @@
 #------------------------------------------------------------------------------
-# $File: dyadic,v 1.4 2009/09/19 16:28:09 christos Exp $
+# $File: dyadic,v 1.5 2010/09/20 18:55:20 rrt Exp $
 # Dyadic: file(1) magic for Dyalog APL.
 #
-0 byte 0xaa
+0 byte 0xaa
 >1 byte <4 Dyalog APL
 >>1 byte 0x00 incomplete workspace
 >>1 byte 0x01 component file
@@ -11,3 +11,36 @@
 >>1 byte 0x03 workspace
 >>2 byte x version %d
 >>3 byte x .%d
+
+0 beshort 0xaa03 Dyalog APL
+>2 byte x workspace type %d
+>3 byte x subtype %d
+>7 byte&0x28 0x00 32-bit
+>7 byte&0x28 0x20 64-bit
+>7 byte&0x0c 0x00 classic
+>7 byte&0x0c 0x04 unicode
+>7 byte&0x88 0x00 big-endian
+>7 byte&0x88 0x80 little-endian
+
+0 byte 0xaa Dyalog APL
+>1 byte 0x00 aplcore
+>1 byte 0x01 component file 32-bit non-journaled non-checksummed
+>1 byte 0x02 external variable exclusive
+>1 byte 0x06 external variable shared
+>1 byte 0x07 session
+>1 byte 0x08 mapped file 32-bit
+>1 byte 0x09 component file 64-bit non-journaled non-checksummed
+>1 byte 0x0a mapped file 64-bit
+>1 byte 0x0b component file 32-bit level 1 journaled non-checksummed
+>1 byte 0x0c component file 64-bit level 1 journaled non-checksummed
+>1 byte 0x0d component file 32-bit level 1 journaled checksummed
+>1 byte 0x0e component file 64-bit level 1 journaled checksummed
+>1 byte 0x0f component file 32-bit level 2 journaled checksummed
+>1 byte 0x10 component file 64-bit level 2 journaled checksummed
+>1 byte 0x11 component file 32-bit level 3 journaled checksummed
+>1 byte 0x12 component file 64-bit level 3 journaled checksummed
+>1 byte 0x13 component file 32-bit non-journaled checksummed
+>1 byte 0x14 component file 64-bit non-journaled checksummed
+>1 byte 0x80 DDB
+
+0 short 0x6060 Dyalog APL transfer
diff --git a/contrib/file/magic/Magdir/ebml b/contrib/file/magic/Magdir/ebml
new file mode 100644
index 0000000000..d5d174329a
--- /dev/null
+++ b/contrib/file/magic/Magdir/ebml
@@ -0,0 +1,8 @@
+
+#------------------------------------------------------------------------------
+# $File: ebml,v 1.1 2010/07/02 00:07:03 christos Exp $
+# ebml: file(1) magic for various Extensible Binary Meta Language
+# http://www.matroska.org/technical/specs/index.html#track
+0 belong 0x1a45dfa3 EBML file
+>4 search/b/100 \102\202
+>>&1 string x \b, creator %.8s
diff --git a/contrib/file/magic/Magdir/erlang b/contrib/file/magic/Magdir/erlang
index 4686868767..b604a06828 100644
--- a/contrib/file/magic/Magdir/erlang
+++ b/contrib/file/magic/Magdir/erlang
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: erlang,v 1.5 2009/09/19 16:28:09 christos Exp $
+# $File: erlang,v 1.6 2010/09/20 19:19:17 rrt Exp $
 # erlang: file(1) magic for Erlang JAM and BEAM files
 # URL: http://www.erlang.org/faq/x779.html#AEN812
@@ -17,3 +17,5 @@
 79 string Tue\ Jan\ 22\ 14:32:44\ MET\ 1991 Erlang JAM file - version 4.2
 4 string 1.0\ Fri\ Feb\ 3\ 09:55:56\ MET\ 1995 Erlang JAM file - version 4.3
+
+0 bequad 0x0000000000ABCDEF Erlang DETS file
diff --git a/contrib/file/magic/Magdir/filesystems b/contrib/file/magic/Magdir/filesystems
index 8d60f69fd6..af9695b9a4 100644
--- a/contrib/file/magic/Magdir/filesystems
+++ b/contrib/file/magic/Magdir/filesystems
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: filesystems,v 1.55 2010/01/16 17:45:12 chl Exp $
+# $File: filesystems,v 1.61 2011/01/10 14:01:10 christos Exp $
 # filesystems: file(1) magic for different filesystems
 #
 0 string \366\366\366\366 PC formatted floppy with no filesystem
@@ -884,15 +884,19 @@
 # Minix filesystems - Juan Cespedes
 0x410 leshort 0x137f
+!:strength / 2
 >0x402 beshort < 100 Minix filesystem, V1, %d zones
 >0x1e string minix \b, bootable
 0x410 beshort 0x137f
+!:strength / 2
 >0x402 beshort < 100 Minix filesystem, V1 (big endian), %d zones
 >0x1e string minix \b, bootable
 0x410 leshort 0x138f
+!:strength / 2
 >0x402 beshort < 100 Minix filesystem, V1, 30 char names, %d zones
 >0x1e string minix \b, bootable
 0x410 beshort 0x138f
+!:strength / 2
 >0x402 beshort < 100 Minix filesystem, V1, 30 char names (big endian), %d zones
 >0x1e string minix \b, bootable
 0x410 leshort 0x2468
@@ -1109,6 +1113,8 @@
 # ext2/ext3 filesystems - Andreas Dilger
 # ext4 filesystem - Eric Sandeen
+# volume label and UUID Russell Coker
+# http://etbe.coker.com.au/2008/07/08/label-vs-uuid-vs-device/
 0x438 leshort 0xEF53 Linux
 >0x44c lelong x rev %d
 >0x43e leshort x \b.%d
@@ -1124,25 +1130,32 @@
 # else large RO_COMPAT?
 >>>0x464 lelong >0x0000007 ext4 filesystem data
 # else large INCOMPAT?
->>0x460 lelong >0x000003f ext4 filesystem data
+>>0x460 lelong >0x000003f ext4 filesystem data
+>0x468 belong x \b, UUID=%08x
+>0x46c beshort x \b-%04x
+>0x46e beshort x \b-%04x
+>0x470 beshort x \b-%04x
+>0x472 belong x \b-%08x
+>0x476 beshort x \b%04x
+>0x478 string >0 \b, volume name "%s"
 # General flags for any ext* fs
->0x460 lelong &0x0000004 (needs journal recovery)
->0x43a leshort &0x0000002 (errors)
+>0x460 lelong &0x0000004 (needs journal recovery)
+>0x43a leshort &0x0000002 (errors)
 # INCOMPAT flags
->0x460 lelong &0x0000001 (compressed)
-#>0x460 lelong &0x0000002 (filetype)
-#>0x460 lelong &0x0000010 (meta bg)
->0x460 lelong &0x0000040 (extents)
->0x460 lelong &0x0000080 (64bit)
-#>0x460 lelong &0x0000100 (mmp)
-#>0x460 lelong &0x0000200 (flex bg)
+>0x460 lelong &0x0000001 (compressed)
+#>0x460 lelong &0x0000002 (filetype)
+#>0x460 lelong &0x0000010 (meta bg)
+>0x460 lelong &0x0000040 (extents)
+>0x460 lelong &0x0000080 (64bit)
+#>0x460 lelong &0x0000100 (mmp)
+#>0x460 lelong &0x0000200 (flex bg)
 # RO_INCOMPAT flags
-#>0x464 lelong &0x0000001 (sparse super)
->0x464 lelong &0x0000002 (large files)
->0x464 lelong &0x0000008 (huge files)
-#>0x464 lelong &0x0000010 (gdt checksum)
-#>0x464 lelong &0x0000020 (many subdirs)
-#>0x463 lelong &0x0000040 (extra isize)
+#>0x464 lelong &0x0000001 (sparse super)
+>0x464 lelong &0x0000002 (large files)
+>0x464 lelong &0x0000008 (huge files)
+#>0x464 lelong &0x0000010 (gdt checksum)
+#>0x464 lelong &0x0000020 (many subdirs)
+#>0x463 lelong &0x0000040 (extra isize)
 # SGI disk labels - Nathan Scott
 0 belong 0x0BE5A941 SGI disk label (volume header)
@@ -1220,7 +1233,7 @@
 # CDROM Filesystems
 # Modified for UDF by gerardo.cacciari@gmail.com
-32769 string CD001
+32769 string CD001
 # !:mime application/x-iso9660-image
 >38913 string !NSR0 ISO 9660 CD-ROM filesystem data
 >38913 string NSR0 UDF filesystem data
@@ -1263,6 +1276,7 @@
 # reiserfs - russell@coker.com.au
 0x10034 string ReIsErFs ReiserFS V3.5
 0x10034 string ReIsEr2Fs ReiserFS V3.6
+0x10034 string ReIsEr3Fs ReiserFS V3.6.19
 >0x1002c leshort x block size %d
 >0x10032 leshort &2 (mounted or unclean)
 >0x10000 lelong x num blocks %d
@@ -1359,28 +1373,46 @@
 >28 beshort <3
 >>8 belong x %d bytes,
 >28 beshort >2
->>63 bequad x %lld bytes,
+>>28 beshort <4
+>>>63 bequad x %lld bytes,
+>>28 beshort >3
+>>>40 bequad x %lld bytes,
 #>>67 belong x %d bytes,
 >4 belong x %d inodes,
 >28 beshort <2
 >>32 beshort x blocksize: %d bytes,
 >28 beshort >1
->>51 belong x blocksize: %d bytes,
->39 bedate x created: %s
+>>28 beshort <4
+>>>51 belong x blocksize: %d bytes,
+>>28 beshort >3
+>>>12 belong x blocksize: %d bytes,
+>28 beshort <4
+>>39 bedate x created: %s
+>28 beshort >3
+>>8 bedate x created: %s
 0 string hsqs Squashfs filesystem, little endian,
 >28 leshort x version %d.
 >30 leshort x \b%d,
 >28 leshort <3
 >>8 lelong x %d bytes,
 >28 leshort >2
->>63 lequad x %lld bytes,
+>>28 leshort <4
+>>>63 lequad x %lld bytes,
+>>28 leshort >3
+>>>40 lequad x %lld bytes,
 #>>63 lelong x %d bytes,
 >4 lelong x %d inodes,
 >28 leshort <2
 >>32 leshort x blocksize: %d bytes,
 >28 leshort >1
->>51 lelong x blocksize: %d bytes,
->39 ledate x created: %s
+>>28 leshort <4
+>>>51 lelong x blocksize: %d bytes,
+>>28 leshort >3
+>>>12 lelong x blocksize: %d bytes,
+>28 leshort <4
+>>39 ledate x created: %s
+>28 leshort >3
+>>8 ledate x created: %s
 0 string td\000 floppy image data (TeleDisk)
@@ -1447,13 +1479,17 @@
 0 string CPQRFBLO Compaq/HP RILOE floppy image
 #------------------------------------------------------------------------------
-# Files-11 On-Disk Structure (OpenVMS file system) - gerardo.cacciari@gmail.com
-# These bits come from LBN 1 (home block) of ODS-2 and ODS-5 volumes, which is
-# mapped to VBN 2 of [000000]INDEXF.SYS;1
+# Files-11 On-Disk Structure (File system for various RSX-11 and VMS flavours).
+# These bits come from LBN 1 (home block) of ODS-1, ODS-2 and ODS-5 volumes,
+# which is mapped to VBN 2 of [000000]INDEXF.SYS;1 - gerardo.cacciari@gmail.com
 #
-1008 string DECFILE11B Files-11 On-Disk Structure
+1008 string DECFILE11 Files-11 On-Disk Structure
 >525 byte x Level %d
->525 byte x (ODS-%d OpenVMS file system),
+>525 byte x (ODS-%d);
+>1017 string A RSX-11, VAX/VMS or OpenVMS VAX file system;
+>1017 string B
+>>525 byte 2 VAX/VMS or OpenVMS file system;
+>>525 byte 5 OpenVMS Alpha or Itanium file system;
 >984 string x volume label is '%-12.12s'
 # From: Thomas Klausner
@@ -1468,9 +1504,13 @@
 # From Eric Sandeen
 # GFS2
-0x10000 belong 0x01161970 GFS2 Filesystem
->0x10024 belong x (blocksize %d,
->0x10060 string >\0 lockproto %s)
+0x10000 belong 0x01161970
+>0x10018 belong 0x0000051d GFS1 Filesystem
+>>0x10024 belong x (blocksize %d,
+>>0x10060 string >\0 lockproto %s)
+>0x10018 belong 0x00000709 GFS2 Filesystem
+>>0x10024 belong x (blocksize %d,
+>>0x10060 string >\0 lockproto %s)
 # BTRFS
 0x10040 string _BHRfS_M BTRFS Filesystem
@@ -1490,3 +1530,42 @@
 0 string XFSM
 >0x200 string XFSB XFS filesystem metadump image
+# Type: CROM filesystem
+# From: Werner Fink
+0 string CROMFS CROMFS
+>6 string >\0 \b version %2.2s,
+>8 ulequad >0 \b block data at %lld,
+>16 ulequad >0 \b fblock table at %lld,
+>24 ulequad >0 \b inode table at %lld,
+>32 ulequad >0 \b root at %lld,
+>40 ulelong >0 \b fblock size = %ld,
+>44 ulelong >0 \b block size = %ld,
+>48 ulequad >0 \b bytes = %lld
+
+# Type: xfs metadump image
+# From: Daniel Novotny
+# mb_magic XFSM at 0; superblock magic XFSB at 1 << mb_blocklog
+# but can we do the << ? For now it's always 512 (0x200) anyway.
+0 string XFSM
+>0x200 string XFSB XFS filesystem metadump image
+
+# Type: delta ISO
+# From: Daniel Novotny
+0 string DISO Delta ISO data,
+>4 belong x version %d
+
+# JFS2 (Journaling File System) image. (Old JFS1 has superblock at 0x1000.)
+# See linux/fs/jfs/jfs_superblock.h for layout; see jfs_filsys.h for flags.
+# From: Adam Buchbinder
+0x8000 string JFS1
+# Because it's text-only magic, check a binary value (version) to be sure.
+# Should always be 2, but mkfs.jfs writes it as 1. Needs to be 2 or 1 to be
+# mountable.
+>&0 lelong <3 JFS2 filesystem image
+# Label is followed by a UUID; we have to limit string length to avoid
+# appending the UUID in the case of a 16-byte label.
+>>&144 regex [\x20-\x7E]{1,16} (label "%s")
+>>&0 lequad x \b, %lld blocks
+>>&8 lelong x \b, blocksize %d
+>>&32 lelong&0x00000006 >0 (dirty)
+>>&36 lelong >0 (compressed)
diff --git a/contrib/file/magic/Magdir/fonts b/contrib/file/magic/Magdir/fonts
index 3c5d7441c4..917d372e6e 100644
--- a/contrib/file/magic/Magdir/fonts
+++ b/contrib/file/magic/Magdir/fonts
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: fonts,v 1.21 2009/12/06 23:17:52 rrt Exp $
+# $File: fonts,v 1.23 2010/09/20 18:55:20 rrt Exp $
 # fonts: file(1) magic for font data
 #
 0 search/1 FONT ASCII vfont text
@@ -61,7 +61,15 @@
 0 string \007\001\001\000Copyright\ (c)\ 199 Adobe Multiple Master font
 0 string \012\001\001\000Copyright\ (c)\ 199 Adobe Multiple Master font
+# TrueType/OpenType font collections (.ttc)
+# http://www.microsoft.com/typography/otspec/otff.htm
 0 string ttcf TrueType font collection data
+>4 belong 0x00010000 \b, 1.0
+>>8 belong >0 \b, %d fonts
+>4 belong 0x00020000 \b, 2.0
+>>8 belong >0 \b, %d fonts
+# 0x44454947 = 'DSIG'
+>>>16 belong 0x44534947 \b, digitally signed
 # Opentype font data from Avi Bercovich
 0 string OTTO OpenType font data
@@ -71,3 +79,7 @@
 0 string SplineFontDB: Spline Font Database
 !:mime application/vnd.font-fontforge-sfd
 >14 string x version %s
+
+# EOT
+34 string LP Embedded OpenType (EOT)
+!:mime application/vnd.ms-fontobject
diff --git a/contrib/file/magic/Magdir/games b/contrib/file/magic/Magdir/games
index 782bfe93d7..3bd13f1030 100644
--- a/contrib/file/magic/Magdir/games
+++ b/contrib/file/magic/Magdir/games
@@ -1,6 +1,6 @@
 #------------------------------------------------------------------------------
-# $File: games,v 1.8 2009/09/19 16:28:09 christos Exp $
+# $File: games,v 1.12 2010/11/25 15:00:12 christos Exp $
 # games: file(1) for games
 # Fabio Bonelli
@@ -35,6 +35,7 @@
 # Quake
 0 string PACK Quake I or II world or extension
+>8 lelong >0 \b, %d entries
 #0 string -1\x0a Quake I demo
 #>30 string x version %.4s
@@ -154,6 +155,11 @@
 0 string =PWAD doom patch PWAD data
 >4 lelong x containing %d lumps
+# Build engine group files (Duke Nukem, Shadow Warrior, ...)
+# Extension: .grp
+# Created by: "Ganael Laplanche"
+0 string KenSilverman Build engine group file
+>12 lelong x containing %d files
 # Summary: Warcraft 3 save
 # Extension: .w3g
@@ -174,7 +180,7 @@
 # Modified by (1): Abel Cheung (regex, more game format)
 # FIXME: Some games don't have GM (game type)
 0 regex \\(;.*GM\\[[0-9]{1,2}\\] Smart Game Format
->2 search/0x200 GM[
+>2 search/0x200/b GM[
 >>&0 string 1] (Go)
 >>&0 string 2] (Othello)
 >>&0 string 3] (chess)
@@ -216,13 +222,6 @@
 >>&0 string 39] (Gipf)
 >>&0 string 40] (Kropki)
-
-# Summary: Civilization 4 video
-# Extension: .bik
-# Created by: Abel Cheung
-0 string BIKi Civilization 4 Video
-
-
 ##############################################
 # NetImmerse/Gamebryo game engine entries
@@ -245,3 +244,14 @@
 >&0 string n\ NetImmerse game engine file
 >>&0 regex [0-9a-z.]+ \b, version %s
+# Type: SGF Smart Game Format
+# URL: http://www.red-bean.com/sgf/
+# From: Eduardo Sabbatella
+2 regex/c \\(;.*GM\\[[0-9]{1,2}\\] Smart Game Format
+>2 regex/c GM\\[1\\] - Go Game
+>2 regex/c GM\\[6\\] - BackGammon Game
+>2 regex/c GM\\[11\\] - Hex Game
+>2 regex/c GM\\[18\\] - Amazons Game
+>2 regex/c GM\\[19\\] - Octi Game
+>2 regex/c GM\\[20\\] - Gess Game
+>2 regex/c GM\\[21\\] - twix Game
diff --git a/contrib/file/magic/Magdir/geo b/contrib/file/magic/Magdir/geo
new file mode 100644
index 0000000000..924c71e935
--- /dev/null
+++ b/contrib/file/magic/Magdir/geo
@@ -0,0 +1,105 @@
+
+#------------------------------------------------------------------------------
+# $File: geo,v 1.1 2010/02/23 23:40:07 christos Exp $
+# Geo- files from Kurt Schwehr
+
+######################################################################
+#
+# Acoustic Doppler Current Profilers (ADCP)
+#
+######################################################################
+
+0 beshort 0x7f7f RDI Acoustic Doppler Current Profiler (ADCP)
+
+######################################################################
+#
+# Metadata
+#
+######################################################################
+
+0 string Identification_Information FGDC ASCII metadata
+
+######################################################################
+#
+# Seimsic / Subbottom
+#
+######################################################################
+
+# Knudsen subbottom chirp profiler - Binary File Format: B9
+# KEB D409-03167 V1.75 Huffman
+0 string KEB\ Knudsen seismic KEL binary (KEB) -
+>4 regex [-A-Z0-9]* Software: %s
+>>&1 regex V[0-9]*\.[0-9]* version %s
+
+######################################################################
+#
+# LIDAR - Laser altimetry or bathy
+#
+######################################################################
+
+
+# Caris LIDAR format for LADS comes as two parts... ascii location file and binary waveform data
+0 string HCA LADS Caris Ascii Format (CAF) bathymetric lidar
+>4 regex [0-9]*\.[0-9]* version %s
+
+0 string HCB LADS Caris Binary Format (CBF) bathymetric lidar waveform data
+>3 byte x version %d .
+>4 byte x %d + + +###################################################################### +# +# MULTIBEAM SONARS http://www.ldeo.columbia.edu/res/pi/MB-System/formatdoc/ +# +###################################################################### + +# GeoAcoustics - GeoSwath Plus +4 beshort 0x2002 GeoSwath RDF +0 string Start:- GeoSwatch auf text file + +# Seabeam 2100 +# mbsystem code mb41 +0 string SB2100 SeaBeam 2100 multibeam sonar +0 string SB2100DR SeaBeam 2100 DR multibeam sonar +0 string SB2100PR SeaBeam 2100 PR multibeam sonar + +# This corresponds to MB-System format 94, L-3/ELAC/SeaBeam XSE vendor +# format. It is the format of our upgraded SeaBeam 2112 on R/V KNORR. +0 string $HSF XSE multibeam + +# mb121 http://www.saic.com/maritime/gsf/ +8 string GSF-v SAIC generic sensor format (GSF) sonar data, +>&0 regex [0-9]*\.[0-9]* version %s + +# MGD77 - http://www.ngdc.noaa.gov/mgg/dat/geodas/docs/mgd77.htm +# mb161 +9 string MGD77 MGD77 Header, Marine Geophysical Data Exchange Format + +# MBSystem processing caches the mbinfo output +1 string Swath\ Data\ File: mbsystem info cache + +# Caris John Hughes Clark format +0 string HDCS Caris multibeam sonar related data +1 string Start/Stop\ parameter\ header: Caris ASCII project summary + +###################################################################### +# +# Visualization and 3D modeling +# +###################################################################### + +# IVS - IVS3d.com Tagged Data Represetation +0 string %%\ TDR\ 2.0 IVS Fledermaus TDR file + +# http://www.ecma-international.org/publications/standards/Ecma-363.htm +# 3D in PDFs +0 string U3D ECMA-363, Universal 3D + +###################################################################### +# +# Support files +# +###################################################################### + +# https://midas.psi.ch/elog/ +0 string $@MID@$ elog journal entry diff --git a/contrib/file/magic/Magdir/gimp b/contrib/file/magic/Magdir/gimp index 
4fc65210a5..a360bd8e12 100644 --- a/contrib/file/magic/Magdir/gimp +++ b/contrib/file/magic/Magdir/gimp @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: gimp,v 1.6 2009/09/19 16:28:09 christos Exp $ +# $File: gimp,v 1.7 2010/09/20 18:55:20 rrt Exp $ # GIMP Gradient: file(1) magic for the GIMP's gradient data files # by Federico Mena @@ -12,6 +12,7 @@ # ('Bucky' LaDieu, nega@vt.edu) 0 string gimp\ xcf GIMP XCF image data, +!:mime image/x-xcf >9 string file version 0, >9 string v version >>10 string >\0 %s, diff --git a/contrib/file/magic/Magdir/images b/contrib/file/magic/Magdir/images index 7586ad8e76..d4e2e752b4 100644 --- a/contrib/file/magic/Magdir/images +++ b/contrib/file/magic/Magdir/images @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: images,v 1.64 2009/12/06 00:38:50 christos Exp $ +# $File: images,v 1.70 2010/11/25 15:00:12 christos Exp $ # images: file(1) magic for image formats (see also "iff", and "c-lang" for # XPM bitmaps) # @@ -34,7 +34,7 @@ # The next byte following the magic is always whitespace. 0 search/1 P1 Netpbm PBM image text !:mime image/x-portable-bitmap -0 search/1 P2 Netpbm PGM image text +0 search/1b P2 Netpbm PGM image text !:mime image/x-portable-greymap 0 search/1 P3 Netpbm PPM image text !:mime image/x-portable-pixmap @@ -63,6 +63,25 @@ 0 string IIN1 NIFF image data !:mime image/x-niff +# Canon RAW version 1 (CRW) files are a type of Canon Image File Format +# (CIFF) file. These are apparently all little-endian. +# From: Adam Buchbinder +# URL: http://www.sno.phy.queensu.ca/~phil/exiftool/canon_raw.html +0 string II\x1a\0\0\0HEAPCCDR Canon CIFF raw image data +!:mime image/x-canon-crw +>16 leshort x \b, version %d. +>14 leshort x \b%d + +# Canon RAW version 2 (CR2) files are a kind of TIFF with an extra magic +# number. Put this above the TIFF test to make sure we detect them. +# These are apparently all little-endian. 
+# From: Adam Buchbinder +# URL: http://libopenraw.freedesktop.org/wiki/Canon_CR2 +0 string II\x2a\0\x10\0\0\0CR Canon CR2 raw image data +!:mime image/x-canon-cr2 +>10 byte x \b, version %d. +>11 byte x \b%d + # Tag Image File Format, from Daniel Quinlan (quinlan@yggdrasil.com) # The second word of TIFF files is the TIFF version number, 42, which has # never changed. The TIFF specification recommends testing for it. @@ -310,11 +329,20 @@ # As described in /usr/X11R6/include/X11/XWDFile.h # used by the xwd program. # Bradford Castalia, idaeim, 1/01 -4 belong 7 XWD X Window Dump image data ->100 string >\0 \b, "%s" ->16 belong x \b, %dx ->20 belong x \b%dx ->12 belong x \b%d +# updated by Adam Buchbinder, 2/09 +# The following assumes version 7 of the format; the first long is the length +# of the header, which is at least 25 4-byte longs, and the one at offset 8 +# is a constant which is always either 1 or 2. Offset 12 is the pixmap depth, +# which is a maximum of 32. +0 belong >100 +>8 belong <3 +>>12 belong <33 +>>>4 belong 7 XWD X Window Dump image data +!:mime image/x-xwindowdump +>>>>100 string >\0 \b, "%s" +>>>>16 belong x \b, %dx +>>>>20 belong x \b%dx +>>>>12 belong x \b%d # PDS - Planetary Data System # These files use Parameter Value Language in the header section. @@ -551,11 +579,16 @@ # Bio-Rad .PIC is an image format used by microscope control systems # and related image processing software used by biologists. # From: Vebjorn Ljosa -54 leshort 12345 Bio-Rad .PIC Image File ->0 leshort >0 %hd x ->2 leshort >0 %hd, ->4 leshort =1 1 image in file ->4 leshort >1 %hd images in file +# BOOL values are two-byte integers; use them to rule out false positives. 
+# http://web.archive.org/web/20050317223257/www.cs.ubc.ca/spider/ladic/text/biorad.txt +# Samples: http://www.loci.wisc.edu/software/sample-data +14 leshort <2 +>62 leshort <2 +>>54 leshort 12345 Bio-Rad .PIC Image File +>>>0 leshort >0 %hd x +>>>2 leshort >0 %hd, +>>>4 leshort =1 1 image in file +>>>4 leshort >1 %hd images in file # From Jan "Yenya" Kasprzak # The description of *.mrw format can be found at @@ -590,7 +623,7 @@ # specifications at http://hdf.ncsa.uiuc.edu/ 0 belong 0x0e031301 Hierarchical Data Format (version 4) data !:mime application/x-hdf -0 string \211HDF\r\n\032 Hierarchical Data Format (version 5) data +0 string \211HDF\r\n\032\n Hierarchical Data Format (version 5) data !:mime application/x-hdf # From: Tobias Burnus @@ -630,3 +663,68 @@ # JPEG 2000 Code Stream Bitmap # From Petr Splichal 0 string \xFF\x4F\xFF\x51\x00 JPEG-2000 Code Stream Bitmap data + +# From: Rick Richardson +0 string GARMIN\ BITMAP\ 01 Garmin Bitmap file + +# Type: Ulead Photo Explorer5 (.pe5) +# URL: http://www.jisyo.com/cgibin/view.cgi?EXT=pe5 (Japanese) +# From: Simon Horman +0 string IIO2H Ulead Photo Explorer5 + +# Type: X11 cursor +# URL: http://webcvs.freedesktop.org/mime/shared-mime-info/freedesktop.org.xml.in?view=markup +# From: Mathias Brodala +0 string Xcur X11 cursor + +# Type: Olympus ORF raw images. +# URL: http://libopenraw.freedesktop.org/wiki/Olympus_ORF +# From: Adam Buchbinder +0 string MMOR Olympus ORF raw image data, big-endian +!:mime image/x-olympus-orf +0 string IIRO Olympus ORF raw image data, little-endian +!:mime image/x-olympus-orf +0 string IIRS Olympus ORF raw image data, little-endian +!:mime image/x-olympus-orf + +# Type: files used in modern AVCHD camcoders to store clip information +# Extension: .cpi +# From: Alexander Danilov +0 string HDMV0100 AVCHD Clip Information + +# From: Adam Buchbinder +# URL: http://local.wasp.uwa.edu.au/~pbourke/dataformats/pic/ +# Radiance HDR; usually has .pic or .hdr extension. 
+0 string #?RADIANCE\n Radiance HDR image data +#!mime image/vnd.radiance + +# From: Adam Buchbinder +# URL: http://www.mpi-inf.mpg.de/resources/pfstools/pfs_format_spec.pdf +# Used by the pfstools packages. The regex matches for the image size could +# probably use some work. The MIME type is made up; if there's one in +# actual common use, it should replace the one below. +0 string PFS1\x0a PFS HDR image data +#!mime image/x-pfs +>1 regex [0-9]*\ \b, %s +>>1 regex \ [0-9]{4} \bx%s + +# Type: Foveon X3F +# URL: http://www.photofo.com/downloads/x3f-raw-format.pdf +# From: Adam Buchbinder +# Note that the MIME type isn't defined anywhere that I can find; if +# there's a canonical type for this format, it should replace this one. +0 string FOVb Foveon X3F raw image data +!:mime image/x-x3f +>6 leshort x \b, version %d. +>4 leshort x \b%d +>28 lelong x \b, %dx +>32 lelong x \b%d + +# Paint.NET file +# From Adam Buchbinder +0 string PDN3 Paint.NET image data +!:mime image/x-paintnet + +# Not really an image. +# From: "Tano M. Fotang" +0 string \x46\x4d\x52\x00 ISO/IEC 19794-2 Format Minutiae Record (FMR) diff --git a/contrib/file/magic/Magdir/isz b/contrib/file/magic/Magdir/isz new file mode 100644 index 0000000000..316bbd4acf --- /dev/null +++ b/contrib/file/magic/Magdir/isz @@ -0,0 +1,15 @@ + +#------------------------------------------------------------------------------ +# $File: isz,v 1.1 2010/03/27 16:17:09 christos Exp $ +# ISO Zipped file format +# http://www.ezbsystems.com/isz/iszspec.txt +0 string IsZ! 
ISO Zipped file +>4 byte x \b, header size %u +>5 byte x \b, version %u +>8 lelong x \b, serial %u +#12 leshort x \b, sector size %u +#>16 lelong x \b, total sectors %u +>17 byte >0 \b, password protected +#>24 lequad x \b, segment size %llu +#>32 lelong x \b, blocks %u +#>36 lelong x \b, block size %u diff --git a/contrib/file/magic/Magdir/jpeg b/contrib/file/magic/Magdir/jpeg index dbf8be7987..7814245add 100644 --- a/contrib/file/magic/Magdir/jpeg +++ b/contrib/file/magic/Magdir/jpeg @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: jpeg,v 1.15 2009/09/19 16:28:10 christos Exp $ +# $File: jpeg,v 1.16 2011/01/04 19:29:32 rrt Exp $ # JPEG images # SunOS 5.5.1 had # @@ -127,13 +127,8 @@ # And if there was some sort of looping construct to do searches, plus a few # named accumulators, it would be even more effective... # At least we can show a comment if no other segments got inserted before: ->(4.S+5) byte 0xFE ->>(4.S+8) string >\0 \b, comment: "%s" -# FIXME: When we can do non-byte counted strings, we can use that to get -# the string's count, and fix Debian bug #283760 -#>(4.S+5) byte 0xFE \b, comment -#>>(4.S+6) beshort x \b length=%d -#>>(4.S+8) string >\0 \b, "%s" +>(4.S+5) byte 0xFE \b, comment: +>>(4.S+6) pstring/HJ x "%s" # Or, we can show the encoding type (I've included only the three most common) # and image dimensions if we are lucky and the SOFn (image segment) is here: >(4.S+5) byte 0xC0 \b, baseline diff --git a/contrib/file/magic/Magdir/kde b/contrib/file/magic/Magdir/kde index f8a1e844c5..dda5819a9b 100644 --- a/contrib/file/magic/Magdir/kde +++ b/contrib/file/magic/Magdir/kde @@ -1,11 +1,11 @@ #------------------------------------------------------------------------------ -# $File: kde,v 1.4 2009/09/19 16:28:10 christos Exp $ +# $File: kde,v 1.5 2010/11/25 15:00:12 christos Exp $ # kde: file(1) magic for KDE -0 string [KDE\ Desktop\ Entry] KDE desktop entry +0 string/t [KDE\ Desktop\ Entry] 
KDE desktop entry !:mime application/x-kdelnk -0 string #\ KDE\ Config\ File KDE config file +0 string/t #\ KDE\ Config\ File KDE config file !:mime application/x-kdelnk -0 string #\ xmcd xmcd database file for kscd +0 string/t #\ xmcd xmcd database file for kscd !:mime text/x-xmcd diff --git a/contrib/file/magic/Magdir/kml b/contrib/file/magic/Magdir/kml index 5770101dd3..ed0f42ed85 100644 --- a/contrib/file/magic/Magdir/kml +++ b/contrib/file/magic/Magdir/kml @@ -1,12 +1,12 @@ #------------------------------------------------------------------------------ -# $File: kml,v 1.2 2009/09/19 16:28:10 christos Exp $ +# $File: kml,v 1.3 2010/11/25 15:00:12 christos Exp $ # Type: Google KML, formerly Keyhole Markup Language # Future development of this format has been handed # over to the Open Geospatial Consortium. # http://www.opengeospatial.org/standards/kml/ # From: Asbjoern Sloth Toennesen -0 string \<?xml +0 string/t \<?xml >20 search/400 \ xmlns= >>&0 regex ['"]http://earth.google.com/kml Google KML document !:mime application/vnd.google-earth.kml+xml @@ -22,7 +22,7 @@ # From: Asbjoern Sloth Toennesen >>&0 regex ['"]http://www.opengis.net/kml OpenGIS KML document !:mime application/vnd.google-earth.kml+xml ->>>&1 string 2.2 \b, version 2.2 +>>>&1 string/t 2.2 \b, version 2.2 #------------------------------------------------------------------------------ # Type: Google KML Archive (ZIP based) diff --git a/contrib/file/magic/Magdir/linux b/contrib/file/magic/Magdir/linux index ce337f18cb..a8ddd0e989 100644 --- a/contrib/file/magic/Magdir/linux +++ b/contrib/file/magic/Magdir/linux @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: linux,v 1.33 2009/10/23 16:44:59 christos Exp $ +# $File: linux,v 1.37 2010/12/17 14:19:51 christos Exp $ # linux: file(1) magic for Linux files # # Values for Linux/i386 binaries, from Daniel Quinlan @@ -45,24 +45,46 @@ # this can be overridden by the DOS executable (COM) entry 2 string LILO Linux/i386 LILO
boot/chain loader # +# Linux make config build file, from Ole Aamot +28 string make\ config Linux make config build file +# # PSF fonts, from H. Peter Anvin -0 leshort 0x0436 Linux/i386 PC Screen Font data, ->2 byte 0 256 characters, no directory, ->2 byte 1 512 characters, no directory, ->2 byte 2 256 characters, Unicode directory, ->2 byte 3 512 characters, Unicode directory, +# Updated by Adam Buchbinder +# See: http://www.win.tue.nl/~aeb/linux/kbd/font-formats-1.html +0 leshort 0x0436 Linux/i386 PC Screen Font v1 data, +>2 byte&0x01 0 256 characters, +>2 byte&0x01 !0 512 characters, +>2 byte&0x02 0 no directory, +>2 byte&0x02 !0 Unicode directory, >3 byte >0 8x%d +0 string \x72\xb5\x4a\x86\x00\x00 Linux/i386 PC Screen Font v2 data, +>16 lelong x %d characters, +>12 lelong&0x01 0 no directory, +>12 lelong&0x01 !0 Unicode directory, +>24 lelong x %d +>28 lelong x \bx%d + # Linux swap file, from Daniel Quinlan 4086 string SWAP-SPACE Linux/i386 swap file # From: Jeff Bailey # Linux swap file with swsusp1 image, from Jeff Bailey 4076 string SWAPSPACE2S1SUSPEND Linux/i386 swap file (new style) with SWSUSP1 image +# From: James Hunt +4076 string SWAPSPACE2LINHIB0001 Linux/i386 swap file (new style) (compressed hibernate) # according to man page of mkswap (8) March 1999 -4086 string SWAPSPACE2 Linux/i386 swap file (new style) ->0x400 long x %d (4K pages) ->0x404 long x size %d pages ->>4086 string SWAPSPACE2 ->>>1052 string >\0 Label %s +# volume label and UUID Russell Coker +# http://etbe.coker.com.au/2008/07/08/label-vs-uuid-vs-device/ +4086 string SWAPSPACE2 Linux/i386 swap file (new style), +>0x400 long x version %d (4K pages), +>0x404 long x size %d pages, +>1052 string \0 no label, +>1052 string >\0 LABEL=%s, +>0x40c belong x UUID=%08x +>0x410 beshort x \b-%04x +>0x412 beshort x \b-%04x +>0x414 beshort x \b-%04x +>0x416 belong x \b-%08x +>0x41a beshort x \b%04x # From Daniel Novotny # swap file for PowerPC 65526 string SWAPSPACE2 Linux/ppc swap file @@ -267,3 
+289,9 @@ >20 search/256 (name >>&1 string x (name %s) +# Type: Xen, the virtual machine monitor +# From: Radek Vokal +0 string LinuxGuestRecord Xen saved domain +#>2 regex \(name\ [^)]*\) %s +>20 search/256 (name (name +>>&1 string x %s...) diff --git a/contrib/file/magic/Magdir/llvm b/contrib/file/magic/Magdir/llvm index 058fe4e106..44a4009403 100644 --- a/contrib/file/magic/Magdir/llvm +++ b/contrib/file/magic/Magdir/llvm @@ -1,11 +1,13 @@ #------------------------------------------------------------------------------ -# $File: llvm,v 1.4 2009/09/19 16:28:10 christos Exp $ +# $File: llvm,v 1.5 2010/09/20 18:55:20 rrt Exp $ # llvm: file(1) magic for LLVM byte-codes -# URL: http://llvm.cs.uiuc.edu/docs/BytecodeFormat.html#signature +# URL: http://llvm.org/docs/BitCodeFormat.html # From: Al Stone 0 string llvm LLVM byte-codes, uncompressed 0 string llvc0 LLVM byte-codes, null compression 0 string llvc1 LLVM byte-codes, gzip compression 0 string llvc2 LLVM byte-codes, bzip2 compression +0 string \xde\xc0\x17\x0b LLVM bitcode, wrapper +0 string BC\xc0\xde LLVM bitcode diff --git a/contrib/file/magic/Magdir/macintosh b/contrib/file/magic/Magdir/macintosh index 42c8e551f0..b8ee46321d 100644 --- a/contrib/file/magic/Magdir/macintosh +++ b/contrib/file/magic/Magdir/macintosh @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: macintosh,v 1.20 2009/09/19 16:28:10 christos Exp $ +# $File: macintosh,v 1.21 2010/09/20 19:19:17 rrt Exp $ # macintosh description # # BinHex is the Macintosh ASCII-encoded file format (see also "apple") @@ -376,3 +376,15 @@ # From: Remi Mommsen 0 string BOMStore Mac OS X bill of materials (BOM) file + +# From: Adam Buchbinder +# URL: http://en.wikipedia.org/wiki/Datafork_TrueType +# Derived from the 'fondu' and 'ufond' source code (fondu.sf.net). 'sfnt' is +# TrueType; 'POST' is PostScript. 'FONT' and 'NFNT' sometimes appear, but I +# don't know what they mean. 
+0 belong 0x100 +>(0x4.L+24) beshort x +>>&4 belong 0x73666e74 Mac OSX datafork font, TrueType +>>&4 belong 0x464f4e54 Mac OSX datafork font, 'FONT' +>>&4 belong 0x4e464e54 Mac OSX datafork font, 'NFNT' +>>&4 belong 0x504f5354 Mac OSX datafork font, PostScript diff --git a/contrib/file/magic/Magdir/magic b/contrib/file/magic/Magdir/magic index ba56caac74..0de332aa3b 100644 --- a/contrib/file/magic/Magdir/magic +++ b/contrib/file/magic/Magdir/magic @@ -1,9 +1,9 @@ #------------------------------------------------------------------------------ -# $File: magic,v 1.9 2009/09/19 16:28:10 christos Exp $ +# $File: magic,v 1.10 2010/11/25 15:00:12 christos Exp $ # magic: file(1) magic for magic files # -0 string #\ Magic magic text file for file(1) cmd +0 string/t #\ Magic magic text file for file(1) cmd 0 lelong 0xF11E041C magic binary file for file(1) cmd >4 lelong x (version %d) (little endian) 0 belong 0xF11E041C magic binary file for file(1) cmd diff --git a/contrib/file/magic/Magdir/mail.news b/contrib/file/magic/Magdir/mail.news index 56a5250d99..98ecb4a918 100644 --- a/contrib/file/magic/Magdir/mail.news +++ b/contrib/file/magic/Magdir/mail.news @@ -1,36 +1,36 @@ #------------------------------------------------------------------------------ -# $File: mail.news,v 1.17 2009/09/19 16:28:10 christos Exp $ +# $File: mail.news,v 1.18 2010/11/25 15:00:12 christos Exp $ # mail.news: file(1) magic for mail and news # # Unfortunately, saved netnews also has From line added in some news software. #0 string From mail text # There are tests to ascmagic.c to cope with mail and news. 
-0 string Relay-Version: old news text +0 string/t Relay-Version: old news text !:mime message/rfc822 -0 string #!\ rnews batched news text +0 string/t #!\ rnews batched news text !:mime message/rfc822 -0 string N#!\ rnews mailed, batched news text +0 string/t N#!\ rnews mailed, batched news text !:mime message/rfc822 -0 string Forward\ to mail forwarding text +0 string/t Forward\ to mail forwarding text !:mime message/rfc822 -0 string Pipe\ to mail piping text +0 string/t Pipe\ to mail piping text !:mime message/rfc822 -0 string Return-Path: smtp mail text +0 string/t Return-Path: smtp mail text !:mime message/rfc822 -0 string Path: news text +0 string/t Path: news text !:mime message/news -0 string Xref: news text +0 string/t Xref: news text !:mime message/news -0 string From: news or mail text +0 string/t From: news or mail text !:mime message/rfc822 -0 string Article saved news text +0 string/t Article saved news text !:mime message/news -0 string BABYL Emacs RMAIL text -0 string Received: RFC 822 mail text +0 string/t BABYL Emacs RMAIL text +0 string/t Received: RFC 822 mail text !:mime message/rfc822 -0 string MIME-Version: MIME entity text -#0 string Content- MIME entity text +0 string/t MIME-Version: MIME entity text +#0 string/t Content- MIME entity text # TNEF files... 
0 lelong 0x223E9F78 Transport Neutral Encapsulation Format diff --git a/contrib/file/magic/Magdir/matroska b/contrib/file/magic/Magdir/matroska index 0ede715471..62299d21f9 100644 --- a/contrib/file/magic/Magdir/matroska +++ b/contrib/file/magic/Magdir/matroska @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: matroska,v 1.5 2009/09/27 19:02:12 christos Exp $ +# $File: matroska,v 1.6 2010/09/20 21:11:35 rrt Exp $ # matroska: file(1) magic for Matroska files # # See http://www.matroska.org/ @@ -13,3 +13,11 @@ # DocType contents: >>8 string matroska Matroska data !:mime video/x-matroska + +# EBML id: +0 belong 0x1a45dfa3 +# DocType id: +>0 search/4096 \x42\x82 +# DocType contents: +>>&1 string webm WebM +!:mime video/webm diff --git a/contrib/file/magic/Magdir/mime b/contrib/file/magic/Magdir/mime index d1740e89af..42ca52dc6b 100644 --- a/contrib/file/magic/Magdir/mime +++ b/contrib/file/magic/Magdir/mime @@ -1,9 +1,9 @@ #------------------------------------------------------------------------------ -# $File: mime,v 1.5 2009/09/19 16:28:10 christos Exp $ +# $File: mime,v 1.6 2010/11/25 15:00:12 christos Exp $ # mime: file(1) magic for MIME encoded files # -0 string Content-Type:\ +0 string/t Content-Type:\ >14 string >\0 %s -0 string Content-Type: +0 string/t Content-Type: >13 string >\0 %s diff --git a/contrib/file/magic/Magdir/mips b/contrib/file/magic/Magdir/mips index 6ed7c9ab77..b0c496edbf 100644 --- a/contrib/file/magic/Magdir/mips +++ b/contrib/file/magic/Magdir/mips @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: mips,v 1.5 2009/09/19 16:28:10 christos Exp $ +# $File: mips,v 1.6 2010/08/13 16:12:30 christos Exp $ # mips: file(1) magic for Silicon Graphics (MIPS, IRIS, IRIX, etc.) 
# Dec Ultrix (MIPS) # all of SGI's *current* machines and OSes run in big-endian mode on the @@ -171,8 +171,12 @@ # GLF is OpenGL stream encoding 0 string glfHeadMagic(); GLF_TEXT 4 belong 0x7d000000 GLF_BINARY_LSB_FIRST +!:strength -30 4 belong 0x0000007d GLF_BINARY_MSB_FIRST +!:strength -30 # GLS is OpenGL stream encoding; GLS is the successor of GLF 0 string glsBeginGLS( GLS_TEXT 4 belong 0x10000000 GLS_BINARY_LSB_FIRST +!:strength -30 4 belong 0x00000010 GLS_BINARY_MSB_FIRST +!:strength -30 diff --git a/contrib/file/magic/Magdir/misctools b/contrib/file/magic/Magdir/misctools index 03fcfc0f04..394706564e 100644 --- a/contrib/file/magic/Magdir/misctools +++ b/contrib/file/magic/Magdir/misctools @@ -1,10 +1,11 @@ #----------------------------------------------------------------------------- -# $File: misctools,v 1.10 2009/09/19 16:28:10 christos Exp $ +# $File: misctools,v 1.12 2010/09/29 18:36:49 rrt Exp $ # misctools: file(1) magic for miscellaneous UNIX tools. # 0 search/1 %%!! X-Post-It-Note text 0 string/c BEGIN:VCALENDAR vCalendar calendar file +!:mime text/calendar 0 string/c BEGIN:VCARD vCard visiting card !:mime text/x-vcard @@ -12,6 +13,12 @@ 4 string gtktalog GNOME Catalogue (gtktalog) >13 string >\0 version %s +# Summary: GStreamer binary registry +# Extension: .bin +# Submitted by: Josh Triplett +0 belong 0xc0def00d GStreamer binary registry +>4 string x \b, version %s + # Summary: Libtool library file # Extension: .la # Submitted by: Tomasz Trojanowski @@ -21,3 +28,6 @@ # Extension: .lo # Submitted by: Abel Cheung 0 search/80 .lo\ -\ a\ libtool\ object\ file libtool object file + +# From: Daniel Novotny +0 string MDMP\x93\xA7 MDMP crash report data diff --git a/contrib/file/magic/Magdir/modem b/contrib/file/magic/Magdir/modem index 6d2540a3ac..84bdb28776 100644 --- a/contrib/file/magic/Magdir/modem +++ b/contrib/file/magic/Magdir/modem @@ -1,12 +1,12 @@ #------------------------------------------------------------------------------ -# $File: modem,v 
1.4 2009/09/19 16:28:10 christos Exp $ +# $File: modem,v 1.5 2010/09/20 18:55:20 rrt Exp $ # modem: file(1) magic for modem programs # # From: Florian La Roche -4 string Research, Digifax-G3-File ->29 byte 1 , fine resolution ->29 byte 0 , normal resolution +1 string PC\ Research,\ Inc Digifax-G3-File +>29 byte 1 \b, fine resolution +>29 byte 0 \b, normal resolution 0 short 0x0100 raw G3 data, byte-padded 0 short 0x1400 raw G3 data diff --git a/contrib/file/magic/Magdir/msdos b/contrib/file/magic/Magdir/msdos index 223724e6d2..456166e840 100644 --- a/contrib/file/magic/Magdir/msdos +++ b/contrib/file/magic/Magdir/msdos @@ -1,12 +1,12 @@ #------------------------------------------------------------------------------ -# $File: msdos,v 1.65 2009/09/19 16:28:11 christos Exp $ +# $File: msdos,v 1.71 2011/01/10 14:01:10 christos Exp $ # msdos: file(1) magic for MS-DOS files # # .BAT files (Daniel Quinlan, quinlan@yggdrasil.com) # updated by Joerg Jenderek at Oct 2008 -0 string @ +0 string/t @ >1 string/cW \ echo\ off DOS batch file text !:mime text/x-msdos-batch >1 string/cW echo\ off DOS batch file text @@ -19,8 +19,10 @@ # OS/2 batch files are REXX. 
the second regex is a bit generic, oh well # the matched commands seem to be common in REXX and uncommon elsewhere -100 regex/c =^[\ \t]{0,10}call[\ \t]{1,10}rxfunc OS/2 REXX batch file text -100 regex/c =^[\ \t]{0,10}say\ ['"] OS/2 REXX batch file text +100 search/0xffff rxfuncadd +>100 regex/c =^[\ \t]{0,10}call[\ \t]{1,10}rxfunc OS/2 REXX batch file text +100 search/0xffff say +>100 regex/c =^[\ \t]{0,10}say\ ['"] OS/2 REXX batch file text 0 leshort 0x14c MS Windows COFF Intel 80386 object file #>4 ledate x stamp %s @@ -35,114 +37,111 @@ 0 leshort 0x290 MS Windows COFF PA-RISC object file #>4 ledate x stamp %s -# XXX - according to Microsoft's spec, at an offset of 0x3c in a -# PE-format executable is the offset in the file of the PE header; -# unfortunately, that's a little-endian offset, and there's no way -# to specify an indirect offset with a specified byte order. -# So, for now, we assume the standard MS-DOS stub, which puts the -# PE header at 0x80 = 128. +# Tests for various EXE types. # -# Required OS version and subsystem version were 4.0 on some NT 3.51 -# executables built with Visual C++ 4.0, so it's not clear that -# they're interesting. The user version was 0.0, but there's -# probably some linker directive to set it. The linker version was -# 3.0, except for one ".exe" which had it as 4.20 (same damn linker!). -# -# many of the compressed formats were extraced from IDARC 1.23 source code +# Many of the compressed formats were extraced from IDARC 1.23 source code. 
# 0 string MZ !:mime application/x-dosexec ->0x18 leshort <0x40 MS-DOS executable ->0 string MZ\0\0\0\0\0\0\0\0\0\0PE\0\0 \b, PE for MS Windows ->>&18 leshort&0x2000 >0 (DLL) ->>&88 leshort 0 (unknown subsystem) ->>&88 leshort 1 (native) ->>&88 leshort 2 (GUI) ->>&88 leshort 3 (console) ->>&88 leshort 7 (POSIX) ->>&0 leshort 0x0 unknown processor ->>&0 leshort 0x14c Intel 80386 ->>&0 leshort 0x166 MIPS R4000 ->>&0 leshort 0x184 Alpha ->>&0 leshort 0x268 Motorola 68000 ->>&0 leshort 0x1f0 PowerPC ->>&0 leshort 0x290 PA-RISC ->>&18 leshort&0x0100 >0 32-bit ->>&18 leshort&0x1000 >0 system file ->>&0xf4 search/0x140 \x0\x40\x1\x0 ->>>(&0.l+(4)) string MSCF \b, WinHKI CAB self-extracting archive ->30 string Copyright\ 1989-1990\ PKWARE\ Inc. Self-extracting PKZIP archive -!:mime application/zip -# Is next line correct? One might expect "Corp." not "Copr." If it is right, add a note to that effect. ->30 string PKLITE\ Copr. Self-extracting PKZIP archive -!:mime application/zip - +# All non-DOS EXE extensions have the relocation table more than 0x40 bytes into the file. +>0x18 leshort <0x40 MS-DOS executable +# These traditional tests usually work but not always. When test quality support is +# implemented these can be turned on. +#>>0x18 leshort 0x1c (Borland compiler) +#>>0x18 leshort 0x1e (MS compiler) + +# If the relocation table is 0x40 or more bytes into the file, it's definitely +# not a DOS EXE. >0x18 leshort >0x3f + +# Maybe it's a PE? 
>>(0x3c.l) string PE\0\0 PE ->>>(0x3c.l+25) byte 1 \b32 executable ->>>(0x3c.l+25) byte 2 \b32+ executable -# hooray, there's a DOS extender using the PE format, with a valid PE -# executable inside (which just prints a message and exits if run in win) ->>>(0x3c.l+92) leshort <10 ->>>>(8.s*16) string 32STUB for MS-DOS, 32rtm DOS extender ->>>>(8.s*16) string !32STUB for MS Windows ->>>>>(0x3c.l+22) leshort&0x2000 >0 (DLL) ->>>>>(0x3c.l+92) leshort 0 (unknown subsystem) ->>>>>(0x3c.l+92) leshort 1 (native) ->>>>>(0x3c.l+92) leshort 2 (GUI) ->>>>>(0x3c.l+92) leshort 3 (console) ->>>>>(0x3c.l+92) leshort 7 (POSIX) +>>>(0x3c.l+24) leshort 0x010b \b32 executable +>>>(0x3c.l+24) leshort 0x020b \b32+ executable +>>>(0x3c.l+24) leshort 0x0107 ROM image +>>>(0x3c.l+24) default x Unknown PE signature +>>>>&0 leshort x 0x%x +>>>(0x3c.l+22) leshort&0x2000 >0 (DLL) +>>>(0x3c.l+92) leshort 1 (native) +>>>(0x3c.l+92) leshort 2 (GUI) +>>>(0x3c.l+92) leshort 3 (console) +>>>(0x3c.l+92) leshort 7 (POSIX) +>>>(0x3c.l+92) leshort 9 (Windows CE) >>>(0x3c.l+92) leshort 10 (EFI application) >>>(0x3c.l+92) leshort 11 (EFI boot service driver) >>>(0x3c.l+92) leshort 12 (EFI runtime driver) ->>>(0x3c.l+92) leshort 13 (XBOX) ->>>(0x3c.l+4) leshort 0x0 unknown processor +>>>(0x3c.l+92) leshort 13 (EFI ROM) +>>>(0x3c.l+92) leshort 14 (XBOX) +>>>(0x3c.l+92) leshort 15 (Windows boot application) +>>>(0x3c.l+92) default x (Unknown subsystem +>>>>&0 leshort x 0x%x) >>>(0x3c.l+4) leshort 0x14c Intel 80386 >>>(0x3c.l+4) leshort 0x166 MIPS R4000 +>>>(0x3c.l+4) leshort 0x168 MIPS R10000 >>>(0x3c.l+4) leshort 0x184 Alpha ->>>(0x3c.l+4) leshort 0x268 Motorola 68000 +>>>(0x3c.l+4) leshort 0x1a2 Hitachi SH3 +>>>(0x3c.l+4) leshort 0x1a6 Hitachi SH4 +>>>(0x3c.l+4) leshort 0x1c0 ARM +>>>(0x3c.l+4) leshort 0x1c2 ARM Thumb >>>(0x3c.l+4) leshort 0x1f0 PowerPC ->>>(0x3c.l+4) leshort 0x290 PA-RISC >>>(0x3c.l+4) leshort 0x200 Intel Itanium ->>>(0x3c.l+22) leshort&0x0100 >0 32-bit +>>>(0x3c.l+4) leshort 0x266 
MIPS16 +>>>(0x3c.l+4) leshort 0x268 Motorola 68000 +>>>(0x3c.l+4) leshort 0x290 PA-RISC +>>>(0x3c.l+4) leshort 0x366 MIPSIV +>>>(0x3c.l+4) leshort 0x466 MIPS16 with FPU +>>>(0x3c.l+4) leshort 0xebc EFI byte code +>>>(0x3c.l+4) leshort 0x8664 x86-64 +>>>(0x3c.l+4) leshort 0xc0ee MSIL +>>>(0x3c.l+4) default x Unknown processor type +>>>>&0 leshort x 0x%x +>>>(0x3c.l+22) leshort&0x0200 >0 (stripped to external PDB) >>>(0x3c.l+22) leshort&0x1000 >0 system file ->>>(0x3c.l+232) lelong >0 Mono/.Net assembly - ->>>>(0x3c.l+0xf8) string UPX0 \b, UPX compressed ->>>>(0x3c.l+0xf8) search/0x140 PEC2 \b, PECompact2 compressed ->>>>(0x3c.l+0xf8) search/0x140 UPX2 ->>>>>(&0x10.l+(-4)) string PK\3\4 \b, ZIP self-extracting archive (Info-Zip) ->>>>(0x3c.l+0xf8) search/0x140 .idata ->>>>>(&0xe.l+(-4)) string PK\3\4 \b, ZIP self-extracting archive (Info-Zip) ->>>>>(&0xe.l+(-4)) string ZZ0 \b, ZZip self-extracting archive ->>>>>(&0xe.l+(-4)) string ZZ1 \b, ZZip self-extracting archive ->>>>(0x3c.l+0xf8) search/0x140 .rsrc ->>>>>(&0x0f.l+(-4)) string a\\\4\5 \b, WinHKI self-extracting archive ->>>>>(&0x0f.l+(-4)) string Rar! \b, RAR self-extracting archive ->>>>>(&0x0f.l+(-4)) search/0x3000 MSCF \b, InstallShield self-extracting archive ->>>>>(&0x0f.l+(-4)) search/32 Nullsoft \b, Nullsoft Installer self-extracting archive ->>>>(0x3c.l+0xf8) search/0x140 .data ->>>>>(&0x0f.l) string WEXTRACT \b, MS CAB-Installer self-extracting archive ->>>>(0x3c.l+0xf8) search/0x140 .petite\0 \b, Petite compressed ->>>>>(0x3c.l+0xf7) byte x ->>>>>>(&0x104.l+(-4)) string =!sfx! 
\b, ACE self-extracting archive ->>>>(0x3c.l+0xf8) search/0x140 .WISE \b, WISE installer self-extracting archive ->>>>(0x3c.l+0xf8) search/0x140 .dz\0\0\0 \b, Dzip self-extracting archive ->>>>(0x3c.l+0xf8) search/0x140 .reloc ->>>>>(&0xe.l+(-4)) search/0x180 PK\3\4 \b, ZIP self-extracting archive (WinZip) - ->>>>&(0x3c.l+0xf8) search/0x100 _winzip_ \b, ZIP self-extracting archive (WinZip) ->>>>&(0x3c.l+0xf8) search/0x100 SharedD \b, Microsoft Installer self-extracting archive ->>>>0x30 string Inno \b, InnoSetup self-extracting archive +>>>(0x3c.l+24) leshort 0x010b +>>>>(0x3c.l+232) lelong >0 Mono/.Net assembly +>>>(0x3c.l+24) leshort 0x020b +>>>>(0x3c.l+248) lelong >0 Mono/.Net assembly +# hooray, there's a DOS extender using the PE format, with a valid PE +# executable inside (which just prints a message and exits if run in win) +>>>(8.s*16) string 32STUB \b, 32rtm DOS extender +>>>(8.s*16) string !32STUB \b, for MS Windows +>>>(0x3c.l+0xf8) string UPX0 \b, UPX compressed +>>>(0x3c.l+0xf8) search/0x140 PEC2 \b, PECompact2 compressed +>>>(0x3c.l+0xf8) search/0x140 UPX2 +>>>>(&0x10.l+(-4)) string PK\3\4 \b, ZIP self-extracting archive (Info-Zip) +>>>(0x3c.l+0xf8) search/0x140 .idata +>>>>(&0xe.l+(-4)) string PK\3\4 \b, ZIP self-extracting archive (Info-Zip) +>>>>(&0xe.l+(-4)) string ZZ0 \b, ZZip self-extracting archive +>>>>(&0xe.l+(-4)) string ZZ1 \b, ZZip self-extracting archive +>>>(0x3c.l+0xf8) search/0x140 .rsrc +>>>>(&0x0f.l+(-4)) string a\\\4\5 \b, WinHKI self-extracting archive +>>>>(&0x0f.l+(-4)) string Rar! \b, RAR self-extracting archive +>>>>(&0x0f.l+(-4)) search/0x3000 MSCF \b, InstallShield self-extracting archive +>>>>(&0x0f.l+(-4)) search/32 Nullsoft \b, Nullsoft Installer self-extracting archive +>>>(0x3c.l+0xf8) search/0x140 .data +>>>>(&0x0f.l) string WEXTRACT \b, MS CAB-Installer self-extracting archive +>>>(0x3c.l+0xf8) search/0x140 .petite\0 \b, Petite compressed +>>>>(0x3c.l+0xf7) byte x +>>>>>(&0x104.l+(-4)) string =!sfx! 
\b, ACE self-extracting archive +>>>(0x3c.l+0xf8) search/0x140 .WISE \b, WISE installer self-extracting archive +>>>(0x3c.l+0xf8) search/0x140 .dz\0\0\0 \b, Dzip self-extracting archive +>>>&(0x3c.l+0xf8) search/0x100 _winzip_ \b, ZIP self-extracting archive (WinZip) +>>>&(0x3c.l+0xf8) search/0x100 SharedD \b, Microsoft Installer self-extracting archive +>>>0x30 string Inno \b, InnoSetup self-extracting archive + +# Hmm, not a PE but the relocation table is too high for a traditional DOS exe, +# must be one of the unusual subformats. >>(0x3c.l) string !PE\0\0 MS-DOS executable >>(0x3c.l) string NE \b, NE ->>>(0x3c.l+0x36) byte 0 (unknown OS) >>>(0x3c.l+0x36) byte 1 for OS/2 1.x >>>(0x3c.l+0x36) byte 2 for MS Windows 3.x >>>(0x3c.l+0x36) byte 3 for MS-DOS ->>>(0x3c.l+0x36) byte >3 (unknown OS) +>>>(0x3c.l+0x36) byte 4 for Windows 386 +>>>(0x3c.l+0x36) byte 5 for Borland Operating System Services +>>>(0x3c.l+0x36) default x +>>>>(0x3c.l+0x36) byte x (unknown OS %x) >>>(0x3c.l+0x36) byte 0x81 for MS-DOS, Phar Lap DOS extender >>>(0x3c.l+0x0c) leshort&0x8003 0x8002 (DLL) >>>(0x3c.l+0x0c) leshort&0x8003 0x8001 (driver) @@ -226,16 +225,32 @@ >(8.s*16) string $WdX \b, WDos/X DOS extender -# .EXE formats (Greg Roelofs, newt@uchicago.edu) +# By now an executable type should have been printed out. The executable +# may be a self-uncompressing archive, so look for evidence of that and +# print it out. +# +# Some signatures below from Greg Roelofs, newt@uchicago.edu. 
# >0x35 string \x8e\xc0\xb9\x08\x00\xf3\xa5\x4a\x75\xeb\x8e\xc3\x8e\xd8\x33\xff\xbe\x30\x00\x05 \b, aPack compressed ->0xe7 string LH/2\ Self-Extract \b, %s ->0x1c string diet \b, diet compressed ->0x1c string LZ09 \b, LZEXE v0.90 compressed ->0x1c string LZ91 \b, LZEXE v0.91 compressed ->0x1c string tz \b, TinyProg compressed ->0x1e string PKLITE \b, %s compressed ->0x64 string W\ Collis\0\0 \b, Compack compressed +>0xe7 string LH/2\ Self-Extract \b, %s +>0x1c string UC2X \b, UCEXE compressed +>0x1c string WWP\ \b, WWPACK compressed +>0x1c string RJSX \b, ARJ self-extracting archive +>0x1c string diet \b, diet compressed +>0x1c string LZ09 \b, LZEXE v0.90 compressed +>0x1c string LZ91 \b, LZEXE v0.91 compressed +>0x1c string tz \b, TinyProg compressed +>0x1e string Copyright\ 1989-1990\ PKWARE\ Inc. Self-extracting PKZIP archive +!:mime application/zip +# Yes, this really is "Copr", not "Corp." +>0x1e string PKLITE\ Copr. Self-extracting PKZIP archive +!:mime application/zip +# winarj stores a message in the stub instead of the sig in the MZ header +>0x20 search/0xe0 aRJsfX \b, ARJ self-extracting archive +>0x20 string AIN +>>0x23 string 2 \b, AIN 2.x compressed +>>0x23 string <2 \b, AIN 1.x compressed +>>0x23 string >2 \b, AIN 1.x compressed >0x24 string LHa's\ SFX \b, LHa self-extracting archive !:mime application/x-lha >0x24 string LHA's\ SFX \b, LHa self-extracting archive @@ -243,18 +258,17 @@ >0x24 string \ $ARX \b, ARX self-extracting archive >0x24 string \ $LHarc \b, LHarc self-extracting archive >0x20 string SFX\ by\ LARC \b, LARC self-extracting archive +>0x40 string aPKG \b, aPackage self-extracting archive +>0x64 string W\ Collis\0\0 \b, Compack compressed +>0x7a string Windows\ self-extracting\ ZIP \b, ZIP self-extracting archive +>>&0xf4 search/0x140 \x0\x40\x1\x0 +>>>(&0.l+(4)) string MSCF \b, WinHKI CAB self-extracting archive >1638 string -lh5- \b, LHa self-extracting archive v2.13S >0x17888 string Rar! 
\b, RAR self-extracting archive ->0x40 string aPKG \b, aPackage self-extracting archive ->32 string AIN ->>35 string 2 \b, AIN 2.x compressed ->>35 string <2 \b, AIN 1.x compressed ->>35 string >2 \b, AIN 1.x compressed ->28 string UC2X \b, UCEXE compressed ->28 string WWP\ \b, WWPACK compressed - -# skip to the end of the exe +# Skip to the end of the EXE. This will usually work fine in the PE case +# because the MZ image is hardcoded into the toolchain and almost certainly +# won't match any of these signatures. >(4.s*512) long x >>&(2.s-517) byte x >>>&0 string PK\3\4 \b, ZIP self-extracting archive @@ -266,13 +280,8 @@ >>>&7 search/400 **ACE** \b, ACE self-extracting archive >>>&0 search/0x480 UC2SFX\ Header \b, UC2 self-extracting archive ->0x1c string RJSX \b, ARJ self-extracting archive -# winarj stores a message in the stub instead of the sig in the MZ header ->0x20 search/0xe0 aRJsfX \b, ARJ self-extracting archive - # a few unknown ZIP sfxes, no idea if they are needed or if they are # already captured by the generic patterns above ->122 string Windows\ self-extracting\ ZIP \b, ZIP self-extracting archive >(8.s*16) search/0x20 PKSFX \b, ZIP self-extracting archive (PKZIP) # TODO: how to add this? >FileSize-34 string Windows\ Self-Installing\ Executable \b, ZIP self-extracting archive # @@ -334,6 +343,13 @@ # start with assembler instructions mov eax,21cd4cffh 0 uleshort&0xc0ff 0xc0b8 >1 lelong 0x21cd4cff COM executable (32-bit COMBOOT) +# syslinux:doc/comboot.txt +# A COM32R program must start with the byte sequence B8 FE 4C CD 21 (mov +# eax,21cd4cfeh) as a magic number. +0 string \xb8\xfe\x4c\xcd\x21 COM executable (COM32R) +# start with assembler instructions mov eax,21cd4cfeh +0 uleshort&0xc0ff 0xc0b8 +>1 lelong 0x21cd4cfe COM executable (32-bit COMBOOT, relocatable) 0 string \x81\xfc >4 string \x77\x02\xcd\x20\xb9 >>36 string UPX! 
FREE-DOS executable (COM), UPX compressed @@ -481,7 +497,7 @@ # Windows icons (Ian Springer ) 0 string \000\000\001\000 MS Windows icon resource -!:mime image/x-ico +!:mime image/x-icon >4 byte 1 - 1 icon >4 byte >1 - %d icons >>6 byte >0 \b, %dx @@ -519,6 +535,13 @@ # Acroread or something files wrongly identified as G3 .pfm # these have the form \000 \001 any? \002 \000 \000 # or \000 \001 any? \022 \000 \000 +0 belong&0xffff00ff 0x00010012 PFM data +>4 string \000\000 +>6 string >\060 - %s + +0 belong&0xffff00ff 0x00010002 PFM data +>4 string \000\000 +>6 string >\060 - %s #0 string \000\001 pfm? #>3 string \022\000\000Copyright\ yes #>3 string \002\000\000Copyright\ yes @@ -629,43 +652,58 @@ #-------------------------------------------------------------------- # Qemu Emulator Images # Lines written by Friedrich Schwittay (f.schwittay@yousable.de) -# Made by reading sources and doing trial and error on existing -# qcow files -0 string QFI Qemu Image, Format: Qcow +# Updated by Adam Buchbinder (adam.buchbinder@gmail.com) +# Made by reading sources, reading documentation, and doing trial and error +# on existing QCOW files +0 string QFI\xFB QEMU QCOW Image # Uncomment the following line to display Magic (only used for debugging # this magic number) #>0 string x , Magic: %s -# There are currently 2 Versions: "1" and "2" -# I do not use Version 2 and therefor branch here -# but can assure: it works (tested on both versions) -# Also my Qemu 0.9.0 which uses this Version 2 refuses -# to start in its bios ->0x04 belong 2 , Version: 2 ->0x04 belong 1 , Version: 1 +# There are currently 2 Versions: "1" and "2". 
+# http://www.gnome.org/~markmc/qcow-image-format-version-1.html +>4 belong 1 (v1) -# Using the existence of the Backing File Offset to Branch or not +# Using the existence of the Backing File Offset to determine whether # to read Backing File Information ->>0xc belong >0 , Backing File( Offset: %lu ->>>(0xc.L) string >\0 , Path: %s - -# Didn't get the trick here how qemu stores the "Size" at this Position -# There is actually something stored but nothing makes sense -# The header in the sources talks about it -#>>>16 lelong x , Size: %lu +>>12 belong >0 \b, has backing file ( +# Note that this isn't a null-terminated string; the length is actually +# (16.L). Assuming a null-terminated string happens to work usually, but it +# may spew junk until it reaches a \0 in some cases. +>>>(12.L) string >\0 \bpath %s # Modification time of the Backing File # Really useful if you want to know if your backing # file is still usable together with this image ->>>20 bedate x , Mtime: %s ) +>>>>20 bedate >0 \b, mtime %s) +>>>>20 default x \b) -# Don't know how to calculate in Magicfiles -# Also: this Information is not reliably -# stored in image-files ->>24 lelong x , Disk Size could be: %d * 256 bytes +# Size is stored in bytes in a big-endian u64. +>>24 bequad x \b, %lld bytes -0 string QEVM QEMU's suspend to disk image +# 1 for AES encryption, 0 for none. +>>36 belong 1 \b, AES-encrypted + +# http://www.gnome.org/~markmc/qcow-image-format.html +>4 belong 2 (v2) +# Using the existence of the Backing File Offset to determine whether +# to read Backing File Information +>>8 bequad >0 \b, has backing file +# Note that this isn't a null-terminated string; the length is actually +# (16.L). Assuming a null-terminated string happens to work usually, but it +# may spew junk until it reaches a \0 in some cases. Also, since there's no +# .Q modifier, we just use the bottom four bytes as an offset. 
Note that if +# the file is over 4G, and the backing file path is stored after the first 4G, +# the wrong filename will be printed. (This should be (8.Q), when that syntax +# is introduced.) +>>>(12.L) string >\0 (path %s) +>>24 bequad x \b, %lld bytes +>>32 belong 1 \b, AES-encrypted + +>4 default x (unknown version) + +0 string QEVM QEMU suspend to disk image 0 string Bochs\ Virtual\ HD\ Image Bochs disk image, >32 string x type %s, @@ -714,3 +752,10 @@ 0 string ITOLITLS Microsoft Reader eBook Data >8 lelong x \b, version %u !:mime application/x-ms-reader + +# Windows CE Binary Image Data Format +# From: Dr. Jesus +0 string B000FF\n Windows Embedded CE binary image + +# Windows Imaging (WIM) Image +0 string MSWIM\000\000\000 Windows imaging (WIM) image diff --git a/contrib/file/magic/Magdir/ocaml b/contrib/file/magic/Magdir/ocaml index 7fc626ce84..3ec3100c6d 100644 --- a/contrib/file/magic/Magdir/ocaml +++ b/contrib/file/magic/Magdir/ocaml @@ -1,8 +1,8 @@ #------------------------------------------------------------------------------ -# $File: ocaml,v 1.4 2009/09/19 16:28:11 christos Exp $ +# $File: ocaml,v 1.5 2010/09/20 18:55:20 rrt Exp $ # ocaml: file(1) magic for Objective Caml files. -0 string Caml1999 Objective caml +0 string Caml1999 OCaml >8 string X exec file >8 string I interface file (.cmi) >8 string O object file (.cmo) @@ -11,4 +11,4 @@ >8 string Z native library file (.cmxa) >8 string M abstract syntax tree implementation file >8 string N abstract syntax tree interface file ->9 string >\0 (Version %3.3s). 
+>9 string >\0 (Version %3.3s) diff --git a/contrib/file/magic/Magdir/parrot b/contrib/file/magic/Magdir/parrot new file mode 100644 index 0000000000..24e9236da4 --- /dev/null +++ b/contrib/file/magic/Magdir/parrot @@ -0,0 +1,22 @@ +#------------------------------------------------------------------------------ +# $File: parrot,v 1.1 2010/07/08 20:18:40 christos Exp $ +# parrot: file(1) magic for Parrot Virtual Machine +# URL: http://www.parrot.org/ +# From: Lubomir Rintel + +# Compiled Parrot byte code +0 string \376PBC\r\n\032\n Parrot bytecode +>64 byte x %d. +>72 byte x \b%d, +>8 byte >0 %d byte words, +>16 byte 0 little-endian, +>16 byte 1 big-endian, +>32 byte 0 IEEE-754 8 byte double floats, +>32 byte 1 x86 12 byte long double floats, +>32 byte 2 IEEE-754 16 byte long double floats, +>32 byte 3 MIPS 16 byte long double floats, +>32 byte 4 AIX 16 byte long double floats, +>32 byte 5 4-byte floats, +>40 byte x Parrot %d. +>48 byte x \b%d. +>56 byte x \b%d diff --git a/contrib/file/magic/Magdir/printer b/contrib/file/magic/Magdir/printer index d3edb6f3a0..bf6d2e897c 100644 --- a/contrib/file/magic/Magdir/printer +++ b/contrib/file/magic/Magdir/printer @@ -1,11 +1,11 @@ #------------------------------------------------------------------------------ -# $File: printer,v 1.22 2009/09/19 16:28:12 christos Exp $ +# $File: printer,v 1.23 2010/11/25 15:00:12 christos Exp $ # printer: file(1) magic for printer-formatted files # # PostScript, updated by Daniel Quinlan (quinlan@yggdrasil.com) -0 string %! PostScript document text +0 string/t %!
PostScript document text !:mime application/postscript !:apple ASPSTEXT >2 string PS-Adobe- conforming diff --git a/contrib/file/magic/Magdir/psion b/contrib/file/magic/Magdir/psion deleted file mode 100644 index 7aa2d74520..0000000000 --- a/contrib/file/magic/Magdir/psion +++ /dev/null @@ -1,43 +0,0 @@ -#------------------------------------------------------------------------------ -# psion: file(1) magic for Psion handhelds data -# from: Peter Breitenlohner -# -0 lelong 0x10000037 Psion Series 5 ->4 lelong 0x10000039 font file ->4 lelong 0x1000003A printer driver ->4 lelong 0x1000003B clipboard ->4 lelong 0x10000042 multi-bitmap image ->4 lelong 0x1000006A application information file ->4 lelong 0x1000006D ->>8 lelong 0x1000007D sketch image -!:mime image/x-psion-sketch ->>8 lelong 0x1000007E voice note ->>8 lelong 0x1000007F word file ->>8 lelong 0x10000085 OPL program ->>8 lelong 0x10000088 sheet file ->>8 lelong 0x100001C4 EasyFax initialisation file ->4 lelong 0x10000073 OPO module ->4 lelong 0x10000074 OPL application ->4 lelong 0x1000008A exported multi-bitmap image - -0 lelong 0x10000041 Psion Series 5 ROM multi-bitmap image - -0 lelong 0x10000050 Psion Series 5 ->4 lelong 0x1000006D database ->4 lelong 0x100000E4 ini file - -0 lelong 0x10000079 Psion Series 5 binary: ->4 lelong 0x00000000 DLL ->4 lelong 0x10000049 comms hardware library ->4 lelong 0x1000004A comms protocol library ->4 lelong 0x1000005D OPX ->4 lelong 0x1000006C application ->4 lelong 0x1000008D DLL ->4 lelong 0x100000AC logical device driver ->4 lelong 0x100000AD physical device driver ->4 lelong 0x100000E5 file transfer protocol ->4 lelong 0x100000E5 file transfer protocol ->4 lelong 0x10000140 printer definition ->4 lelong 0x10000141 printer definition - -0 lelong 0x1000007A Psion Series 5 executable diff --git a/contrib/file/magic/Magdir/python b/contrib/file/magic/Magdir/python index 703504c3b8..8aa12736ea 100644 --- a/contrib/file/magic/Magdir/python +++ 
b/contrib/file/magic/Magdir/python @@ -1,12 +1,12 @@ #------------------------------------------------------------------------------ -# $File: python,v 1.12 2009/10/27 14:49:57 christos Exp $ +# $File: python,v 1.16 2010/12/31 18:15:28 christos Exp $ # python: file(1) magic for python # # From: David Necas # often the module starts with a multiline string -0 string """ a python script text executable -# MAGIC as specified in Python/import.c (1.5 to 2.6a1 and 3.1a0, assuming +0 string/t """ a python script text executable +# MAGIC as specified in Python/import.c (1.5 to 2.7a0 and 3.1a0, assuming # that Py_UnicodeFlag is off for Python 2) # 20121 ( YEAR - 1995 ) + MONTH + DAY (little endian followed by "\r\n" 0 belong 0x994e0d0a python 1.5/1.6 byte-compiled @@ -17,6 +17,7 @@ 0 belong 0x6df20d0a python 2.4 byte-compiled 0 belong 0xb3f20d0a python 2.5 byte-compiled 0 belong 0xd1f20d0a python 2.6 byte-compiled +0 belong 0x03f30d0a python 2.7 byte-compiled 0 belong 0x3b0c0d0a python 3.0 byte-compiled 0 belong 0x4f0c0d0a python 3.1 byte-compiled @@ -26,5 +27,33 @@ !:mime text/x-python 0 search/1 #!/usr/bin/env\ python Python script text executable !:mime text/x-python -0 search/1 #!\ /usr/bin/env\ ruby Python script text executable +0 search/1 #!\ /usr/bin/env\ python Python script text executable +!:mime text/x-python + + +# from module.submodule import func1, func2 +0 regex/b \^from\\s+(\\w|\\.)+\\s+import.*$ Python script text executable +!:mime text/x-python + +# def __init__ (self, ...): +0 search/4096 def\ __init__ +>&0 search/64 self Python script text executable +!:mime text/x-python + +# comments +0 search/4096 ''' +>&0 regex .*'''$ Python script text executable +!:mime text/x-python + +0 search/4096 """ +>&0 regex .*"""$ Python script text executable +!:mime text/x-python + +# try: +# except: or finally: +# block +0 search/4096 try: +>&0 regex \^\\s*except.*: Python script text executable +!:mime text/x-python +>&0 search/4096 finally: Python script text 
executable !:mime text/x-python diff --git a/contrib/file/magic/Magdir/revision b/contrib/file/magic/Magdir/revision index d973faf132..b337ee3b2d 100644 --- a/contrib/file/magic/Magdir/revision +++ b/contrib/file/magic/Magdir/revision @@ -1,9 +1,9 @@ #------------------------------------------------------------------------------ -# $File: revision,v 1.6 2009/09/19 16:28:12 christos Exp $ +# $File: revision,v 1.8 2010/11/25 15:00:12 christos Exp $ # file(1) magic for revision control files # From Hendrik Scholz -0 string /1\ :pserver: cvs password text file +0 string/t /1\ :pserver: cvs password text file # Conary changesets # From: Jonathan Smith @@ -13,8 +13,40 @@ # From: Josh Triplett 0 string #\ v2\ git\ bundle\n Git bundle +# Type: Git pack +# From: Adam Buchbinder +# The actual magic is 'PACK', but that clashes with Doom/Quake packs. However, +# those have a little-endian offset immediately following the magic 'PACK', +# the first byte of which is never 0, while the first byte of the Git pack +# version, since it's a tiny number stored in big-endian format, is always 0. 
+0 string PACK\0 Git pack +>4 belong >0 \b, version %d +>>8 belong >0 \b, %d objects + +# Type: Git pack index +# From: Adam Buchbinder +0 string \377tOc Git pack index +>4 belong =2 \b, version 2 + +# Type: Git index file +# From: Frédéric Brière +0 string DIRC Git index +>4 belong >0 \b, version %d +>>8 belong >0 \b, %d entries + # Type: Mercurial bundles # From: Seo Sanghyeon 0 string HG10 Mercurial bundle, >4 string UN uncompressed >4 string BZ bzip2 compressed + +# Type: Subversion (SVN) dumps +# From: Uwe Zeisberger +0 string SVN-fs-dump-format-version: Subversion dumpfile +>28 string >\0 (version: %s) + +# Type: Bazaar revision bundles and merge requests +# URL: http://www.bazaar-vcs.org/ +# From: Jelmer Vernooij +0 string #\ Bazaar\ revision\ bundle\ v Bazaar Bundle +0 string #\ Bazaar\ merge\ directive\ format Bazaar merge directive diff --git a/contrib/file/magic/Magdir/riff b/contrib/file/magic/Magdir/riff index 9bc3c4b45c..40100b4f5e 100644 --- a/contrib/file/magic/Magdir/riff +++ b/contrib/file/magic/Magdir/riff @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: riff,v 1.18 2009/09/19 16:28:12 christos Exp $ +# $File: riff,v 1.20 2010/09/20 18:55:20 rrt Exp $ # riff: file(1) magic for RIFF format # See # @@ -40,12 +40,14 @@ >>20 leshort 2 \b, Microsoft ADPCM >>20 leshort 6 \b, ITU G.711 A-law >>20 leshort 7 \b, ITU G.711 mu-law +>>20 leshort 8 \b, Microsoft DTS >>20 leshort 17 \b, IMA ADPCM >>20 leshort 20 \b, ITU G.723 ADPCM (Yamaha) >>20 leshort 49 \b, GSM 6.10 >>20 leshort 64 \b, ITU G.721 ADPCM >>20 leshort 80 \b, MPEG >>20 leshort 85 \b, MPEG Layer 3 +>>20 leshort 0x2001 \b, DTS >>22 leshort =1 \b, mono >>22 leshort =2 \b, stereo >>22 leshort >2 \b, %d channels @@ -224,3 +226,31 @@ >8 string NIFF \b, Notation Interchange File Format # SoundFont 2 >8 string sfbk SoundFont/Bank + +#------------------------------------------------------------------------------ +# Sony Wave64 +# see 
http://www.vcs.de/fileadmin/user_upload/MBS/PDF/Whitepaper/Informations_about_Sony_Wave64.pdf +# 128 bit RIFF-GUID { 66666972-912E-11CF-A5D6-28DB04C10000 } in little-endian +0 string riff\x2E\x91\xCF\x11\xA5\xD6\x28\xDB\x04\xC1\x00\x00 Sony Wave64 RIFF data +# 128 bit + total file size (64 bits) so 24 bytes +# then WAVE-GUID { 65766177-ACF3-11D3-8CD1-00C04F8EDB8A } +>24 string wave\xF3\xAC\xD3\x11\x8C\xD1\x00\xC0\x4F\x8E\xDB\x8A \b, WAVE 64 audio +!:mime audio/x-w64 +# FMT-GUID { 20746D66-ACF3-11D3-8CD1-00C04F8EDB8A } +>>40 search/256 fmt\x20\xF3\xAC\xD3\x11\x8C\xD1\x00\xC0\x4F\x8E\xDB\x8A \b +>>>&10 leshort =1 \b, mono +>>>&10 leshort =2 \b, stereo +>>>&10 leshort >2 \b, %d channels +>>>&12 lelong >0 %d Hz + +#------------------------------------------------------------------------------ +# MBWF/RF64 +# see EBU – TECH 3306 http://tech.ebu.ch/docs/tech/tech3306-2009.pdf +0 string RF64\xff\xff\xff\xffWAVEds64 MBWF/RF64 audio +!:mime audio/x-wav +>52 search/256 fmt\x20 \b +>>&6 leshort =1 \b, mono +>>&6 leshort =2 \b, stereo +>>&6 leshort >2 \b, %d channels +>>&8 lelong >0 %d Hz + diff --git a/contrib/file/magic/Magdir/rpm b/contrib/file/magic/Magdir/rpm index 455f9c7728..c273795518 100644 --- a/contrib/file/magic/Magdir/rpm +++ b/contrib/file/magic/Magdir/rpm @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: rpm,v 1.9 2009/11/06 13:53:52 christos Exp $ +# $File: rpm,v 1.10 2010/09/20 19:19:17 rrt Exp $ # # RPM: file(1) magic for Red Hat Packages Erik Troan (ewt@redhat.com) # @@ -36,3 +36,9 @@ >>>8 beshort 17 SuperH >>>8 beshort 18 Xtensa >>10 string x %s + +# Type: Delta RPM +# From: Daniel Novotny (dnovotny@redhat.com) +0 string drpm Delta RPM +!:mime application/x-rpm +>12 string x %s diff --git a/contrib/file/magic/Magdir/ruby b/contrib/file/magic/Magdir/ruby index 70302955f5..26630f3a82 100644 --- a/contrib/file/magic/Magdir/ruby +++ b/contrib/file/magic/Magdir/ruby @@ -1,6 +1,6 @@ 
#------------------------------------------------------------------------------ -# $File: ruby,v 1.3 2009/09/19 16:28:12 christos Exp $ +# $File: ruby,v 1.5 2010/07/21 16:47:17 christos Exp $ # ruby: file(1) magic for Ruby scripting language # URL: http://www.ruby-lang.org/ # From: Reuben Thomas @@ -14,3 +14,15 @@ !:mime text/x-ruby 0 search/1 #!\ /usr/bin/env\ ruby Ruby script text executable !:mime text/x-ruby + +# What looks like ruby, but does not have a shebang +# (modules and such) +# From: Lubomir Rintel +0 regex \^[\ \t]*require[\ \t]'[A-Za-z_/]+' +>0 regex include\ [A-Z]|def\ [a-z]|\ do$ +>>0 regex \^[\ \t]*end([\ \t]*[;#].*)?$ Ruby script text +!:mime text/x-ruby +0 regex \^[\ \t]*(class|module)[\ \t][A-Z] +>0 regex (modul|includ)e\ [A-Z]|def\ [a-z] +>>0 regex \^[\ \t]*end([\ \t]*[;#].*)?$ Ruby module source text +!:mime text/x-ruby diff --git a/contrib/file/magic/Magdir/scientific b/contrib/file/magic/Magdir/scientific index dcdde17456..7418f1ba54 100644 --- a/contrib/file/magic/Magdir/scientific +++ b/contrib/file/magic/Magdir/scientific @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: scientific,v 1.6 2009/09/19 16:28:12 christos Exp $ +# $File: scientific,v 1.7 2010/09/20 19:19:17 rrt Exp $ # scientific: file(1) magic for scientific formats # # From: Joe Krahn @@ -71,3 +71,36 @@ 0 string \060\000\040\000\110\000\105\000\101\000\104\000 GEDCOM data 0 string \376\377\000\060\000\040\000\110\000\105\000\101\000\104 GEDCOM data 0 string \377\376\060\000\040\000\110\000\105\000\101\000\104\000 GEDCOM data + +# PDB: Protein Data Bank files +# Adam Buchbinder +# +# http://www.wwpdb.org/documentation/format32/sect2.html +# http://www.ch.ic.ac.uk/chemime/ +# +# The PDB file format is fixed-field, 80 columns. From the spec: +# +# COLS DATA +# 1 - 6 "HEADER" +# 11 - 50 String(40) +# 51 - 59 Date +# 63 - 66 IDcode +# +# Thus, positions 7-10, 60-62 and 67-80 are spaces. 
The Date must be in the +# format DD-MMM-YY, e.g., 01-JAN-70, and the IDcode consists of numbers and +# uppercase letters. However, examples have been seen without the date string, +# e.g., the example on the chemime site. +0 string HEADER\ \ \ \ +>&0 regex/1 \^.{40} +>>&0 regex/1 [0-9]{2}-[A-Z]{3}-[0-9]{2}\ {3} +>>>&0 regex/1s [A-Z0-9]{4}.{14}$ +>>>>&0 regex/1 [A-Z0-9]{4} Protein Data Bank data, ID Code %s +!:mime chemical/x-pdb +>>>>0 regex/1 [0-9]{2}-[A-Z]{3}-[0-9]{2} \b, %s + +# Type: GDSII Stream file +0 belong 0x00060002 GDSII Stream file +>4 byte 0x00 +>>5 byte x version %d.0 +>4 byte >0x00 version %d +>>5 byte x \b.%d diff --git a/contrib/file/magic/Magdir/selinux b/contrib/file/magic/Magdir/selinux new file mode 100644 index 0000000000..5f22946543 --- /dev/null +++ b/contrib/file/magic/Magdir/selinux @@ -0,0 +1,24 @@ +# Type: SE Linux policy modules *.pp reference policy +# for Fedora 5 to 9, RHEL5, and Debian Etch and Lenny. +# URL: http://doc.coker.com.au/computers/selinux-magic +# From: Russell Coker + +0 lelong 0xf97cff8f SE Linux modular policy +>4 lelong x version %d, +>8 lelong x %d sections, +>>(12.l) lelong 0xf97cff8d +>>>(12.l+27) lelong x mod version %d, +>>>(12.l+31) lelong 0 Not MLS, +>>>(12.l+31) lelong 1 MLS, +>>>(12.l+23) lelong 2 +>>>>(12.l+47) string >\0 module name %s +>>>(12.l+23) lelong 1 base + +1 string policy_module( SE Linux policy module source +2 string policy_module( SE Linux policy module source + +0 string ##\ SE Linux policy interface source + +#0 search gen_context( SE Linux policy file contexts + +#0 search gen_sens( SE Linux policy MLS constraints source diff --git a/contrib/file/magic/Magdir/sgi b/contrib/file/magic/Magdir/sgi index b1b154b1bd..2a8af1faf1 100644 --- a/contrib/file/magic/Magdir/sgi +++ b/contrib/file/magic/Magdir/sgi @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: sgi,v 1.17 2009/09/19 16:28:12 christos Exp $ +# $File: sgi,v 1.18 2010/11/25 15:00:12 
christos Exp $ # sgi: file(1) magic for Silicon Graphics applications # @@ -61,7 +61,7 @@ >11 byte x dataformat %d # Alias Maya files -0 string //Maya ASCII Alias Maya Ascii File, +0 string/t //Maya ASCII Alias Maya Ascii File, >13 string >\0 version %s 8 string MAYAFOR4 Alias Maya Binary File, >32 string >\0 version %s scene diff --git a/contrib/file/magic/Magdir/sgml b/contrib/file/magic/Magdir/sgml index 0a57375639..65cd26f7e1 100644 --- a/contrib/file/magic/Magdir/sgml +++ b/contrib/file/magic/Magdir/sgml @@ -1,8 +1,8 @@ #------------------------------------------------------------------------------ -# $File: sgml,v 1.24 2009/09/19 17:31:35 christos Exp $ +# $File: sgml,v 1.25 2010/11/25 15:00:12 christos Exp $ # Type: SVG Vectorial Graphics # From: Noel Torres -0 string \15 string >\0 >>19 search/4096 \15 string >\0 >>19 search/4096 \15 string >\0 >>19 search/4096/cWbt \15 string >\0 >>19 search/4096/cWbt \15 string >\0 >>19 search/4096/cWbt \15 search/1 >\0 %.3s document text >>23 search/1 \ + +0 regex \^%?[\ \t]*SiSU[\ \t]+insert SiSU text insert +>5 regex [0-9.]+ %s + +0 regex \^%[\ \t]+SiSU[\ \t]+master SiSU text master +>5 regex [0-9.]+ %s + +0 regex \^%?[\ \t]*SiSU[\ \t]+text SiSU text +>5 regex [0-9.]+ %s + +0 regex \^%?[\ \t]*SiSU[\ \t][0-9.]+ SiSU text +>5 regex [0-9.]+ %s + +0 regex \^%*[\ \t]*sisu-[0-9.]+ SiSU text +>5 regex [0-9.]+ %s diff --git a/contrib/file/magic/Magdir/spectrum b/contrib/file/magic/Magdir/spectrum index e25ebf228f..d2c414b967 100644 --- a/contrib/file/magic/Magdir/spectrum +++ b/contrib/file/magic/Magdir/spectrum @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: spectrum,v 1.6 2009/09/19 16:28:12 christos Exp $ +# $File: spectrum,v 1.7 2010/09/20 18:55:20 rrt Exp $ # spectrum: file(1) magic for Spectrum emulator files. # # John Elliott @@ -20,13 +20,17 @@ # Tape file. This assumes the .TAP starts with a Spectrum-format header, # which nearly all will. 
# -0 string \023\000\000 Spectrum .TAP data ->4 string x "%-10.10s" ->3 byte 0 - BASIC program ->3 byte 1 - number array ->3 byte 2 - character array ->3 byte 3 - memory block ->>14 belong 0x001B0040 (screen) +# Update: Sanity-check string contents to be printable. +# -Adam Buchbinder +# +0 string \023\000\000 +>4 string >\0 +>>4 string <\177 Spectrum .TAP data "%-10.10s" +>>>3 byte 0 - BASIC program +>>>3 byte 1 - number array +>>>3 byte 2 - character array +>>>3 byte 3 - memory block +>>>>14 belong 0x001B0040 (screen) # The following three blocks are from pak21-spectrum@srcf.ucam.org # TZX tape images diff --git a/contrib/file/magic/Magdir/ssh b/contrib/file/magic/Magdir/ssh new file mode 100644 index 0000000000..c87f388303 --- /dev/null +++ b/contrib/file/magic/Magdir/ssh @@ -0,0 +1,8 @@ +# Type: OpenSSH key files +# From: Nicolas Collignon + +0 string SSH\ PRIVATE\ KEY OpenSSH RSA1 private key, +>28 string >\0 version %s + +0 string ssh-dss\ OpenSSH DSA public key +0 string ssh-rsa\ OpenSSH RSA public key diff --git a/contrib/file/magic/Magdir/ssl b/contrib/file/magic/Magdir/ssl new file mode 100644 index 0000000000..4d8706e2b8 --- /dev/null +++ b/contrib/file/magic/Magdir/ssl @@ -0,0 +1,7 @@ +# Type: OpenSSL certificates/key files +# From: Nicolas Collignon + +0 string -----BEGIN\ CERTIFICATE----- PEM certificate +0 string -----BEGIN\ CERTIFICATE\ REQ PEM certificate request +0 string -----BEGIN\ RSA\ PRIVATE PEM RSA private key +0 string -----BEGIN\ DSA\ PRIVATE PEM DSA private key diff --git a/contrib/file/magic/Magdir/sun b/contrib/file/magic/Magdir/sun index 271193812d..d61723c788 100644 --- a/contrib/file/magic/Magdir/sun +++ b/contrib/file/magic/Magdir/sun @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: sun,v 1.20 2009/09/19 16:28:12 christos Exp $ +# $File: sun,v 1.21 2010/05/05 17:38:18 christos Exp $ # sun: file(1) magic for Sun machines # # Values for big-endian Sun (MC680x0, SPARC) binaries 
on pre-5.x @@ -112,9 +112,11 @@ >12 belong 9 (Unknown) # Microsoft ICM color profile +!:mime application/vnd.iccprofile 36 string acspMSFT Microsoft ICM Color Profile # Sun KCMS 36 string acsp Kodak Color Management System, ICC Profile +!:mime application/vnd.iccprofile #--------------------------------------------------------------------------- # The following entries have been tested by Duncan Laurie (a diff --git a/contrib/file/magic/Magdir/tcl b/contrib/file/magic/Magdir/tcl new file mode 100644 index 0000000000..223f93b58c --- /dev/null +++ b/contrib/file/magic/Magdir/tcl @@ -0,0 +1,29 @@ +#------------------------------------------------------------------------------ +# file: file(1) magic for Tcl scripting language +# URL: http://www.tcl.tk/ +# From: gustaf neumann + +# Tcl scripts +0 search/1/w #!\ /usr/bin/tcl Tcl script text executable +!:mime text/x-tcl +0 search/1/w #!\ /usr/local/bin/tcl Tcl script text executable +!:mime text/x-tcl +0 search/1 #!/usr/bin/env\ tcl Tcl script text executable +!:mime text/x-tcl +0 search/1 #!\ /usr/bin/env\ tcl Tcl script text executable +!:mime text/x-tcl +0 search/1/w #!\ /usr/bin/wish Tcl/Tk script text executable +!:mime text/x-tcl +0 search/1/w #!\ /usr/local/bin/wish Tcl/Tk script text executable +!:mime text/x-tcl +0 search/1 #!/usr/bin/env\ wish Tcl/Tk script text executable +!:mime text/x-tcl +0 search/1 #!\ /usr/bin/env\ wish Tcl/Tk script text executable +!:mime text/x-tcl + +# check the first line +0 search/1 package\ req +>0 regex \^package[\ \t]+req Tcl script +# not 'p', check other lines +0 search/1 !p +>0 regex \^package[\ \t]+req Tcl script diff --git a/contrib/file/magic/Magdir/tex b/contrib/file/magic/Magdir/tex index b650497b60..6ac4489223 100644 --- a/contrib/file/magic/Magdir/tex +++ b/contrib/file/magic/Magdir/tex @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: tex,v 1.16 2009/09/19 16:28:12 christos Exp $ +# $File: tex,v 1.17 2010/09/20
19:19:17 rrt Exp $ # tex: file(1) magic for TeX files # # XXX - needs byte-endian stuff (big-endian and little-endian DVI?) @@ -93,3 +93,5 @@ 0 search/1 %\ BibTeX\ ` BibTeX custom bibliography style text file 0 search/1 @c\ @mapfile{ TeX font aliases text file + +0 string #LyX LyX document text diff --git a/contrib/file/magic/Magdir/tgif b/contrib/file/magic/Magdir/tgif index 18c6df975d..e80b3a76cb 100644 --- a/contrib/file/magic/Magdir/tgif +++ b/contrib/file/magic/Magdir/tgif @@ -1,8 +1,7 @@ #------------------------------------------------------------------------------ -# $File: tgif,v 1.5 2009/09/19 16:28:12 christos Exp $ +# $File: tgif,v 1.7 2010/09/20 19:03:46 rrt Exp $ # file(1) magic for tgif(1) files # From Hendrik Scholz - -0 string %TGIF\ x Tgif file version %s - +0 string %TGIF\ Tgif file version +>6 string x %s diff --git a/contrib/file/magic/Magdir/unicode b/contrib/file/magic/Magdir/unicode index ef92c1ec83..f7eb5a2108 100644 --- a/contrib/file/magic/Magdir/unicode +++ b/contrib/file/magic/Magdir/unicode @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: unicode,v 1.5 2009/09/19 16:28:13 christos Exp $ +# $File: unicode,v 1.6 2010/09/20 18:55:20 rrt Exp $ # Unicode: BOM prefixed text files - Adrian Havill # GRR: These types should be recognised in file_ascmagic so these # encodings can be treated by text patterns. 
@@ -11,6 +11,6 @@ 0 string +/v+ Unicode text, UTF-7 0 string +/v/ Unicode text, UTF-7 0 string \335\163\146\163 Unicode text, UTF-8-EBCDIC -0 string \376\377\000\000 Unicode text, UTF-32, big-endian +0 string \000\000\376\377 Unicode text, UTF-32, big-endian 0 string \377\376\000\000 Unicode text, UTF-32, little-endian 0 string \016\376\377 Unicode text, SCSU (Standard Compression Scheme for Unicode) diff --git a/contrib/file/magic/Magdir/varied.out b/contrib/file/magic/Magdir/varied.out index f9135fd8f3..3d8aa9219a 100644 --- a/contrib/file/magic/Magdir/varied.out +++ b/contrib/file/magic/Magdir/varied.out @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: varied.out,v 1.21 2009/09/19 16:28:13 christos Exp $ +# $File: varied.out,v 1.22 2010/07/02 00:06:27 christos Exp $ # varied.out: file(1) magic for various USG systems # # Herewith many of the object file formats used by USG systems. @@ -28,9 +28,12 @@ 0 string gmon GNU prof performance data >4 long x - version %ld # From: Dave Pearson -# Harbour HRB files. +# Harbour HRB files. 
0 string \xc0HRB Harbour HRB file ->4 short x version %d +>4 leshort x version %d +# Harbour HBV files +0 string \xc0HBV Harbour variable dump file +>4 leshort x version %d # From: Alex Beregszaszi # 0 string exec BugOS executable diff --git a/contrib/file/magic/Magdir/varied.script b/contrib/file/magic/Magdir/varied.script index ea5725e420..1f5eee58cb 100644 --- a/contrib/file/magic/Magdir/varied.script +++ b/contrib/file/magic/Magdir/varied.script @@ -1,15 +1,15 @@ #------------------------------------------------------------------------------ -# $File: varied.script,v 1.6 2009/09/19 16:28:13 christos Exp $ +# $File: varied.script,v 1.7 2010/11/25 15:00:12 christos Exp $ # varied.script: file(1) magic for various interpreter scripts -0 string #!\ / a +0 string/t #!\ / a >3 string >\0 %s script text executable -0 string #!\t/ a +0 string/t #!\t/ a >3 string >\0 %s script text executable -0 string #!/ a +0 string/t #!/ a >2 string >\0 %s script text executable -0 string #!\ script text executable +0 string/t #!\ script text executable >3 string >\0 for %s # From: arno diff --git a/contrib/file/magic/Magdir/virtual b/contrib/file/magic/Magdir/virtual new file mode 100644 index 0000000000..ba29c17cb8 --- /dev/null +++ b/contrib/file/magic/Magdir/virtual @@ -0,0 +1,17 @@ + +#------------------------------------------------------------------------------ +# $File: virtual,v 1.1 2009/12/25 16:04:30 christos Exp $ +# From: James Nobis +# Microsoft hard disk images for: +# Virtual Server +# Virtual PC +# http://technet.microsoft.com/en-us/virtualserver/bb676673.aspx +# .vhd +0 string conectix Microsoft Disk Image, Virtual Server or Virtual PC + +# Sun xVM VirtualBox Disk Image +# string <<< Sun xVM VirtualBox Disk Image >>> +# .vdi +0 string \<\<\<\ Sun\ xVM\ VirtualBox\ Disk Sun xVM VirtualBox Disk Image + + diff --git a/contrib/file/magic/Magdir/warc b/contrib/file/magic/Magdir/warc index 72a22ee6d9..f4ba079a1a 100644 --- a/contrib/file/magic/Magdir/warc +++ 
b/contrib/file/magic/Magdir/warc @@ -1,7 +1,16 @@ #------------------------------------------------------------------------------ -# $File: warc,v 1.2 2009/09/19 16:28:13 christos Exp $ +# $File: warc,v 1.3 2010/11/25 15:05:43 christos Exp $ # warc: file(1) magic for WARC files 0 string WARC/ WARC Archive >5 string x version %.4s +!:mime application/warc + +#------------------------------------------------------------------------------ +# Arc File Format from Internet Archive +# see http://www.archive.org/web/researcher/ArcFileFormat.php +0 string filedesc:// Internet Archive File +!:mime application/x-ia-arc +>11 search/256 \x0A \b +>>&0 ubyte >0 \b version %c diff --git a/contrib/file/magic/Magdir/wordprocessors b/contrib/file/magic/Magdir/wordprocessors index 6e35c7d0d7..0ee4723a42 100644 --- a/contrib/file/magic/Magdir/wordprocessors +++ b/contrib/file/magic/Magdir/wordprocessors @@ -1,6 +1,6 @@ #------------------------------------------------------------------------------ -# $File: wordprocessors,v 1.14 2009/09/19 16:28:13 christos Exp $ +# $File: wordprocessors,v 1.15 2010/09/20 19:19:17 rrt Exp $ # wordprocessors: file(1) magic fo word processors. 
# ####### PWP file format used on Smith Corona Personal Word Processors: @@ -150,3 +150,14 @@ 0 string DOC >43 byte 0x16 Just System Word Processor Ichitaro v6 !:mime application/x-ichitaro6 + +# Type: Freemind mindmap documents +# From: Jamie Thompson +0 string/w \ +0 string \next = mlist; mlist->prev = ml; + if (action == FILE_LIST) { + printf("Binary patterns:\n"); + apprentice_list(mlist, BINTEST); + printf("Text patterns:\n"); + apprentice_list(mlist, TEXTTEST); + } + return 0; #endif /* COMPILE_ONLY */ } @@ -548,6 +556,43 @@ apprentice_sort(const void *a, const void *b) return 1; } +/* + * Shows sorted patterns list in the order which is used for the matching + */ +private void +apprentice_list(struct mlist *mlist, int mode) +{ + uint32_t magindex = 0; + struct mlist *ml; + for (ml = mlist->next; ml != mlist; ml = ml->next) { + for (magindex = 0; magindex < ml->nmagic; magindex++) { + struct magic *m = &ml->magic[magindex]; + if ((m->flag & mode) != mode) { + /* Skip sub-tests */ + while (magindex + 1 < ml->nmagic && + ml->magic[magindex + 1].cont_level != 0) + ++magindex; + continue; /* Skip to next top-level test*/ + } + + /* + * Try to iterate over the tree until we find item with + * description/mimetype. 
+ */ + while (magindex + 1 < ml->nmagic && + ml->magic[magindex + 1].cont_level != 0 && + *ml->magic[magindex].desc == '\0' && + *ml->magic[magindex].mimetype == '\0') + magindex++; + + printf("Strength = %3" SIZE_T_FORMAT "u : %s [%s]\n", + apprentice_magic_strength(m), + ml->magic[magindex].desc, + ml->magic[magindex].mimetype); + } + } +} + private void set_test_type(struct magic *mstart, struct magic *m) { @@ -587,8 +632,11 @@ set_test_type(struct magic *mstart, struct magic *m) case FILE_PSTRING: case FILE_BESTRING16: case FILE_LESTRING16: - /* binary test, set flag */ - mstart->flag |= BINTEST; + /* Allow text overrides */ + if (mstart->str_flags & STRING_TEXTTEST) + mstart->flag |= TEXTTEST; + else + mstart->flag |= BINTEST; break; case FILE_REGEX: case FILE_SEARCH: @@ -816,7 +864,8 @@ apprentice_load(struct magic_set *ms, struct magic **magicp, uint32_t *nmagicp, if (marray[i].mp->cont_level == 0) break; if (i != marraycount) { - ms->line = marray[i].mp->lineno; /* XXX - Ugh! */ + /* XXX - Ugh! */ + ms->line = marray[i].mp->lineno; file_magwarn(ms, "level 0 \"default\" did not sort last"); } @@ -932,6 +981,11 @@ string_modifier_check(struct magic_set *ms, struct magic *m) if ((ms->flags & MAGIC_CHECK) == 0) return 0; + if (m->type != FILE_PSTRING && (m->str_flags & PSTRING_LEN) != 0) { + file_magwarn(ms, + "'/BHhLl' modifiers are only allowed for pascal strings\n"); + return -1; + } switch (m->type) { case FILE_BESTRING16: case FILE_LESTRING16: @@ -1308,8 +1362,7 @@ parse(struct magic_set *ms, struct magic_entry **mentryp, uint32_t *nmentryp, ++l; } m->str_range = 0; - m->str_flags = 0; - m->num_mask = 0; + m->str_flags = m->type == FILE_PSTRING ? 
PSTRING_1_LE : 0; if ((op = get_op(*l)) != -1) { if (!IS_STRING(m->type)) { uint64_t val; @@ -1341,7 +1394,8 @@ parse(struct magic_set *ms, struct magic_entry **mentryp, uint32_t *nmentryp, l = t - 1; break; case CHAR_COMPACT_WHITESPACE: - m->str_flags |= STRING_COMPACT_WHITESPACE; + m->str_flags |= + STRING_COMPACT_WHITESPACE; break; case CHAR_COMPACT_OPTIONAL_WHITESPACE: m->str_flags |= @@ -1362,11 +1416,42 @@ parse(struct magic_set *ms, struct magic_entry **mentryp, uint32_t *nmentryp, case CHAR_TEXTTEST: m->str_flags |= STRING_TEXTTEST; break; + case CHAR_PSTRING_1_LE: + if (m->type != FILE_PSTRING) + goto bad; + m->str_flags = (m->str_flags & ~PSTRING_LEN) | PSTRING_1_LE; + break; + case CHAR_PSTRING_2_BE: + if (m->type != FILE_PSTRING) + goto bad; + m->str_flags = (m->str_flags & ~PSTRING_LEN) | PSTRING_2_BE; + break; + case CHAR_PSTRING_2_LE: + if (m->type != FILE_PSTRING) + goto bad; + m->str_flags = (m->str_flags & ~PSTRING_LEN) | PSTRING_2_LE; + break; + case CHAR_PSTRING_4_BE: + if (m->type != FILE_PSTRING) + goto bad; + m->str_flags = (m->str_flags & ~PSTRING_LEN) | PSTRING_4_BE; + break; + case CHAR_PSTRING_4_LE: + if (m->type != FILE_PSTRING) + goto bad; + m->str_flags = (m->str_flags & ~PSTRING_LEN) | PSTRING_4_LE; + break; + case CHAR_PSTRING_LENGTH_INCLUDES_ITSELF: + if (m->type != FILE_PSTRING) + goto bad; + m->str_flags |= PSTRING_LENGTH_INCLUDES_ITSELF; + break; + bad: default: if (ms->flags & MAGIC_CHECK) file_magwarn(ms, - "string extension `%c' invalid", - *l); + "string extension `%c' " + "invalid", *l); return -1; } /* allow multiple '/' for readability */ @@ -1533,7 +1618,8 @@ out: } /* - * Parse an Apple CREATOR/TYPE annotation from magic file and put it into magic[index - 1] + * Parse an Apple CREATOR/TYPE annotation from magic file and put it into + * magic[index - 1] */ private int parse_apple(struct magic_set *ms, struct magic_entry *me, const char *line) @@ -1543,20 +1629,21 @@ parse_apple(struct magic_set *ms, struct magic_entry 
*me, const char *line) struct magic *m = &me->mp[me->cont_count == 0 ? 0 : me->cont_count - 1]; if (m->apple[0] != '\0') { - file_magwarn(ms, "Current entry already has a APPLE type `%.8s'," - " new type `%s'", m->mimetype, l); + file_magwarn(ms, "Current entry already has a APPLE type " + "`%.8s', new type `%s'", m->mimetype, l); return -1; } EATAB; - for (i = 0; *l && ((isascii((unsigned char)*l) && isalnum((unsigned char)*l)) - || strchr("-+/.", *l)) && i < sizeof(m->apple); m->apple[i++] = *l++) + for (i = 0; *l && ((isascii((unsigned char)*l) && + isalnum((unsigned char)*l)) || strchr("-+/.", *l)) && + i < sizeof(m->apple); m->apple[i++] = *l++) continue; if (i == sizeof(m->apple) && *l) { /* We don't need to NUL terminate here, printing handles it */ if (ms->flags & MAGIC_CHECK) - file_magwarn(ms, "APPLE type `%s' truncated %zu", - line, i); + file_magwarn(ms, "APPLE type `%s' truncated %" + SIZE_T_FORMAT "u", line, i); } if (i > 0) @@ -1583,14 +1670,15 @@ parse_mime(struct magic_set *ms, struct magic_entry *me, const char *line) } EATAB; - for (i = 0; *l && ((isascii((unsigned char)*l) && isalnum((unsigned char)*l)) - || strchr("-+/.", *l)) && i < sizeof(m->mimetype); m->mimetype[i++] = *l++) + for (i = 0; *l && ((isascii((unsigned char)*l) && + isalnum((unsigned char)*l)) || strchr("-+/.", *l)) && + i < sizeof(m->mimetype); m->mimetype[i++] = *l++) continue; if (i == sizeof(m->mimetype)) { m->mimetype[sizeof(m->mimetype) - 1] = '\0'; if (ms->flags & MAGIC_CHECK) - file_magwarn(ms, "MIME type `%s' truncated %zu", - m->mimetype, i); + file_magwarn(ms, "MIME type `%s' truncated %" + SIZE_T_FORMAT "u", m->mimetype, i); } else m->mimetype[i] = '\0'; @@ -1879,8 +1967,10 @@ getstr(struct magic_set *ms, struct magic *m, const char *s, int warn) if (isprint((unsigned char)c)) { /* Allow escaping of * ``relations'' */ - if (strchr("<>&^=!", c) - == NULL) { + if (strchr("<>&^=!", c) == NULL + && (m->type != FILE_REGEX || + strchr("[]().*?^$|{}", c) + == NULL)) { 
file_magwarn(ms, "no " "need to escape " "`%c'", c); @@ -1990,7 +2080,7 @@ out: *p = '\0'; m->vallen = CAST(unsigned char, (p - origp)); if (m->type == FILE_PSTRING) - m->vallen++; + m->vallen += file_pstring_length_size(m); return s; } @@ -2371,6 +2461,8 @@ bs1(struct magic *m) m->in_offset = swap4((uint32_t)m->in_offset); m->lineno = swap4((uint32_t)m->lineno); if (IS_STRING(m->type)) { + if (m->type == FILE_PSTRING) + printf("flags! %d\n", m->str_flags); m->str_range = swap4(m->str_range); m->str_flags = swap4(m->str_flags); } @@ -2379,3 +2471,51 @@ bs1(struct magic *m) m->num_mask = swap8(m->num_mask); } } + +protected size_t +file_pstring_length_size(const struct magic *m) +{ + switch (m->str_flags & PSTRING_LEN) { + case PSTRING_1_LE: + return 1; + case PSTRING_2_LE: + case PSTRING_2_BE: + return 2; + case PSTRING_4_LE: + case PSTRING_4_BE: + return 4; + default: + abort(); /* Impossible */ + return 1; + } +} +protected size_t +file_pstring_get_length(const struct magic *m, const char *s) +{ + size_t len = 0; + + switch (m->str_flags & PSTRING_LEN) { + case PSTRING_1_LE: + len = *s; + break; + case PSTRING_2_LE: + len = (s[1] << 8) | s[0]; + break; + case PSTRING_2_BE: + len = (s[0] << 8) | s[1]; + break; + case PSTRING_4_LE: + len = (s[3] << 24) | (s[2] << 16) | (s[1] << 8) | s[0]; + break; + case PSTRING_4_BE: + len = (s[0] << 24) | (s[1] << 16) | (s[2] << 8) | s[3]; + break; + default: + abort(); /* Impossible */ + } + + if (m->str_flags & PSTRING_LENGTH_INCLUDES_ITSELF) + len -= file_pstring_length_size(m); + + return len; +} diff --git a/contrib/file/src/ascmagic.c b/contrib/file/src/ascmagic.c index 9236fb4a27..94db560f9c 100644 --- a/contrib/file/src/ascmagic.c +++ b/contrib/file/src/ascmagic.c @@ -36,7 +36,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: ascmagic.c,v 1.75 2009/02/03 20:27:51 christos Exp $") +FILE_RCSID("@(#)$File: ascmagic.c,v 1.77 2010/11/30 14:58:53 rrt Exp $") #endif /* lint */ #include "magic.h" @@ -93,7 +93,7 @@ 
file_ascmagic(struct magic_set *ms, const unsigned char *buf, size_t nbytes) goto done; } - rv = file_ascmagic_with_encoding(ms, buf, nbytes, ubuf, ulen, code, + rv = file_ascmagic_with_encoding(ms, buf, nbytes, ubuf, ulen, code, type); done: @@ -125,6 +125,7 @@ file_ascmagic_with_encoding(struct magic_set *ms, const unsigned char *buf, int n_lf = 0; int n_cr = 0; int n_nel = 0; + int score, curtype; size_t last_line_end = (size_t)-1; int has_long_lines = 0; @@ -140,27 +141,31 @@ file_ascmagic_with_encoding(struct magic_set *ms, const unsigned char *buf, goto done; } - /* Convert ubuf to UTF-8 and try text soft magic */ - /* malloc size is a conservative overestimate; could be - improved, or at least realloced after conversion. */ - mlen = ulen * 6; - if ((utf8_buf = CAST(unsigned char *, malloc(mlen))) == NULL) { - file_oomem(ms, mlen); - goto done; + if ((ms->flags & MAGIC_NO_CHECK_SOFT) == 0) { + /* Convert ubuf to UTF-8 and try text soft magic */ + /* malloc size is a conservative overestimate; could be + improved, or at least realloced after conversion. */ + mlen = ulen * 6; + if ((utf8_buf = CAST(unsigned char *, malloc(mlen))) == NULL) { + file_oomem(ms, mlen); + goto done; + } + if ((utf8_end = encode_utf8(utf8_buf, mlen, ubuf, ulen)) == NULL) + goto done; + if ((rv = file_softmagic(ms, utf8_buf, (size_t)(utf8_end - utf8_buf), + TEXTTEST)) != 0) + goto done; + else + rv = -1; } - if ((utf8_end = encode_utf8(utf8_buf, mlen, ubuf, ulen)) == NULL) - goto done; - if ((rv = file_softmagic(ms, utf8_buf, (size_t)(utf8_end - utf8_buf), - TEXTTEST)) != 0) - goto done; - else - rv = -1; /* look for tokens from names.h - this is expensive! 
*/ if ((ms->flags & MAGIC_NO_CHECK_TOKENS) != 0) goto subtype_identified; i = 0; + score = 0; + curtype = -1; while (i < ulen) { size_t end; @@ -179,9 +184,18 @@ file_ascmagic_with_encoding(struct magic_set *ms, const unsigned char *buf, for (p = names; p < names + NNAMES; p++) { if (ascmatch((const unsigned char *)p->name, ubuf + i, end - i)) { - subtype = types[p->type].human; - subtype_mime = types[p->type].mime; - goto subtype_identified; + if (curtype == -1) + curtype = p->type; + else if (curtype != p->type) { + score = p->score; + curtype = p->type; + } else + score += p->score; + if (score > 1) { + subtype = types[p->type].human; + subtype_mime = types[p->type].mime; + goto subtype_identified; + } } } diff --git a/contrib/file/src/cdf.c b/contrib/file/src/cdf.c index 92791ea8e6..91065d77da 100644 --- a/contrib/file/src/cdf.c +++ b/contrib/file/src/cdf.c @@ -24,15 +24,18 @@ * POSSIBILITY OF SUCH DAMAGE. */ /* - * Parse composite document files, the format used in Microsoft Office - * document files before they switched to zipped xml. + * Parse Composite Document Files, the format used in Microsoft Office + * document files before they switched to zipped XML. * Info from: http://sc.openoffice.org/compdocfileformat.pdf + * + * N.B. This is the "Composite Document File" format, and not the + * "Compound Document Format", nor the "Channel Definition Format". */ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: cdf.c,v 1.36 2010/01/22 20:56:26 christos Exp $") +FILE_RCSID("@(#)$File: cdf.c,v 1.39 2010/07/22 21:59:42 christos Exp $") #endif #include @@ -74,6 +77,19 @@ static union { #define CDF_TOLE8(x) ((uint64_t)(NEED_SWAP ? cdf_tole8(x) : (uint64_t)(x))) #define CDF_TOLE4(x) ((uint32_t)(NEED_SWAP ? cdf_tole4(x) : (uint32_t)(x))) #define CDF_TOLE2(x) ((uint16_t)(NEED_SWAP ? cdf_tole2(x) : (uint16_t)(x))) +#define CDF_GETUINT32(x, y) cdf_getuint32(x, y) + +/* + * grab a uint32_t from a possibly unaligned address, and return it in + * the native host order. 
+ */ +static uint32_t +cdf_getuint32(const uint8_t *p, size_t offs) +{ + uint32_t rv; + (void)memcpy(&rv, p + offs * sizeof(uint32_t), sizeof(rv)); + return CDF_TOLE4(rv); +} /* * swap a short @@ -82,8 +98,8 @@ uint16_t cdf_tole2(uint16_t sv) { uint16_t rv; - uint8_t *s = (uint8_t *)(void *)&sv; - uint8_t *d = (uint8_t *)(void *)&rv; + uint8_t *s = (uint8_t *)(void *)&sv; + uint8_t *d = (uint8_t *)(void *)&rv; d[0] = s[1]; d[1] = s[0]; return rv; @@ -96,8 +112,8 @@ uint32_t cdf_tole4(uint32_t sv) { uint32_t rv; - uint8_t *s = (uint8_t *)(void *)&sv; - uint8_t *d = (uint8_t *)(void *)&rv; + uint8_t *s = (uint8_t *)(void *)&sv; + uint8_t *d = (uint8_t *)(void *)&rv; d[0] = s[3]; d[1] = s[2]; d[2] = s[1]; @@ -112,8 +128,8 @@ uint64_t cdf_tole8(uint64_t sv) { uint64_t rv; - uint8_t *s = (uint8_t *)(void *)&sv; - uint8_t *d = (uint8_t *)(void *)&rv; + uint8_t *s = (uint8_t *)(void *)&sv; + uint8_t *d = (uint8_t *)(void *)&rv; d[0] = s[7]; d[1] = s[6]; d[2] = s[5]; @@ -239,9 +255,10 @@ cdf_check_stream_offset(const cdf_stream_t *sst, const void *p, size_t tail, (void)&line; if (e >= b && (size_t)(e - b) < sst->sst_dirlen * sst->sst_len) return 0; - DPRINTF(("%d: offset begin %p end %p %zu >= %zu [%zu %zu]\n", - line, b, e, (size_t)(e - b), sst->sst_dirlen * sst->sst_len, - sst->sst_dirlen, sst->sst_len)); + DPRINTF(("%d: offset begin %p end %p %" SIZE_T_FORMAT "u" + " >= %" SIZE_T_FORMAT "u [%" SIZE_T_FORMAT "u %" + SIZE_T_FORMAT "u]\n", line, b, e, (size_t)(e - b), + sst->sst_dirlen * sst->sst_len, sst->sst_dirlen, sst->sst_len)); errno = EFTYPE; return -1; } @@ -284,7 +301,8 @@ cdf_read_header(const cdf_info_t *info, cdf_header_t *h) cdf_unpack_header(h, buf); cdf_swap_header(h); if (h->h_magic != CDF_MAGIC) { - DPRINTF(("Bad magic 0x%llx != 0x%llx\n", + DPRINTF(("Bad magic 0x%" INT64_T_FORMAT "x != 0x%" + INT64_T_FORMAT "x\n", (unsigned long long)h->h_magic, (unsigned long long)CDF_MAGIC)); goto out; @@ -342,14 +360,15 @@ cdf_read_sat(const cdf_info_t *info, 
cdf_header_t *h, cdf_sat_t *sat) #define CDF_SEC_LIMIT (UINT32_MAX / (4 * ss)) if (h->h_num_sectors_in_master_sat > CDF_SEC_LIMIT / nsatpersec || i > CDF_SEC_LIMIT) { - DPRINTF(("Number of sectors in master SAT too big %u %zu\n", - h->h_num_sectors_in_master_sat, i)); + DPRINTF(("Number of sectors in master SAT too big %u %" + SIZE_T_FORMAT "u\n", h->h_num_sectors_in_master_sat, i)); errno = EFTYPE; return -1; } sat->sat_len = h->h_num_sectors_in_master_sat * nsatpersec + i; - DPRINTF(("sat_len = %zu ss = %zu\n", sat->sat_len, ss)); + DPRINTF(("sat_len = %" SIZE_T_FORMAT "u ss = %" SIZE_T_FORMAT "u\n", + sat->sat_len, ss)); if ((sat->sat_tab = CAST(cdf_secid_t *, calloc(sat->sat_len, ss))) == NULL) return -1; @@ -550,7 +569,7 @@ cdf_read_dir(const cdf_info_t *info, const cdf_header_t *h, nd = ss / CDF_DIRECTORY_SIZE; dir->dir_len = ns * nd; - dir->dir_tab = CAST(cdf_directory_t *, + dir->dir_tab = CAST(cdf_directory_t *, calloc(dir->dir_len, sizeof(dir->dir_tab[0]))); if (dir->dir_tab == NULL) return -1; @@ -649,7 +668,7 @@ cdf_read_short_stream(const cdf_info_t *info, const cdf_header_t *h, if (d->d_stream_first_sector < 0) goto out; - return cdf_read_long_sector_chain(info, h, sat, + return cdf_read_long_sector_chain(info, h, sat, d->d_stream_first_sector, d->d_size, scn); out: scn->sst_tab = NULL; @@ -698,14 +717,14 @@ cdf_read_property_info(const cdf_stream_t *sst, uint32_t offs, { const cdf_section_header_t *shp; cdf_section_header_t sh; - const uint32_t *p, *q, *e; + const uint8_t *p, *q, *e; int16_t s16; int32_t s32; uint32_t u32; int64_t s64; uint64_t u64; cdf_timestamp_t tp; - size_t i, o, nelements, j; + size_t i, o, o4, nelements, j; cdf_property_info_t *inp; if (offs > UINT32_MAX / 4) { @@ -744,32 +763,33 @@ cdf_read_property_info(const cdf_stream_t *sst, uint32_t offs, *info = inp; inp += *count; *count += sh.sh_properties; - p = CAST(const uint32_t *, (const void *) + p = CAST(const uint8_t *, (const void *) ((const char *)(const void *)sst->sst_tab + 
offs + sizeof(sh))); - e = CAST(const uint32_t *, (const void *) + e = CAST(const uint8_t *, (const void *) (((const char *)(const void *)shp) + sh.sh_len)); if (cdf_check_stream_offset(sst, e, 0, __LINE__) == -1) goto out; for (i = 0; i < sh.sh_properties; i++) { - q = (const uint32_t *)(const void *) + q = (const uint8_t *)(const void *) ((const char *)(const void *)p + - CDF_TOLE4(p[(i << 1) + 1])) - 2; + CDF_GETUINT32(p, (i << 1) + 1)) - 2 * sizeof(uint32_t); if (q > e) { DPRINTF(("Ran of the end %p > %p\n", q, e)); goto out; } - inp[i].pi_id = CDF_TOLE4(p[i << 1]); - inp[i].pi_type = CDF_TOLE4(q[0]); - DPRINTF(("%d) id=%x type=%x offs=%x\n", i, inp[i].pi_id, - inp[i].pi_type, (const char *)q - (const char *)p)); + inp[i].pi_id = CDF_GETUINT32(p, i << 1); + inp[i].pi_type = CDF_GETUINT32(q, 0); + DPRINTF(("%d) id=%x type=%x offs=%x,%d\n", i, inp[i].pi_id, + inp[i].pi_type, q - p, CDF_GETUINT32(p, (i << 1) + 1))); if (inp[i].pi_type & CDF_VECTOR) { - nelements = CDF_TOLE4(q[1]); + nelements = CDF_GETUINT32(q, 1); o = 2; } else { nelements = 1; o = 1; } + o4 = o * sizeof(uint32_t); if (inp[i].pi_type & (CDF_ARRAY|CDF_BYREF|CDF_RESERVED)) goto unknown; switch (inp[i].pi_type & CDF_TYPEMASK) { @@ -779,32 +799,32 @@ cdf_read_property_info(const cdf_stream_t *sst, uint32_t offs, case CDF_SIGNED16: if (inp[i].pi_type & CDF_VECTOR) goto unknown; - (void)memcpy(&s16, &q[o], sizeof(s16)); + (void)memcpy(&s16, &q[o4], sizeof(s16)); inp[i].pi_s16 = CDF_TOLE2(s16); break; case CDF_SIGNED32: if (inp[i].pi_type & CDF_VECTOR) goto unknown; - (void)memcpy(&s32, &q[o], sizeof(s32)); + (void)memcpy(&s32, &q[o4], sizeof(s32)); inp[i].pi_s32 = CDF_TOLE4((uint32_t)s32); break; case CDF_BOOL: case CDF_UNSIGNED32: if (inp[i].pi_type & CDF_VECTOR) goto unknown; - (void)memcpy(&u32, &q[o], sizeof(u32)); + (void)memcpy(&u32, &q[o4], sizeof(u32)); inp[i].pi_u32 = CDF_TOLE4(u32); break; case CDF_SIGNED64: if (inp[i].pi_type & CDF_VECTOR) goto unknown; - (void)memcpy(&s64, &q[o], 
sizeof(s64)); + (void)memcpy(&s64, &q[o4], sizeof(s64)); inp[i].pi_s64 = CDF_TOLE8((uint64_t)s64); break; case CDF_UNSIGNED64: if (inp[i].pi_type & CDF_VECTOR) goto unknown; - (void)memcpy(&u64, &q[o], sizeof(u64)); + (void)memcpy(&u64, &q[o4], sizeof(u64)); inp[i].pi_u64 = CDF_TOLE8((uint64_t)u64); break; case CDF_LENGTH32_STRING: @@ -824,22 +844,23 @@ cdf_read_property_info(const cdf_stream_t *sst, uint32_t offs, } DPRINTF(("nelements = %d\n", nelements)); for (j = 0; j < nelements; j++, i++) { - uint32_t l = CDF_TOLE4(q[o]); + uint32_t l = CDF_GETUINT32(q, o); inp[i].pi_str.s_len = l; inp[i].pi_str.s_buf = - (const char *)(const void *)(&q[o+1]); + (const char *)(const void *)(&q[o4 + 1]); DPRINTF(("l = %d, r = %d, s = %s\n", l, CDF_ROUND(l, sizeof(l)), inp[i].pi_str.s_buf)); l = 4 + (uint32_t)CDF_ROUND(l, sizeof(l)); o += l >> 2; + o4 = o * sizeof(uint32_t); } i--; break; case CDF_FILETIME: if (inp[i].pi_type & CDF_VECTOR) goto unknown; - (void)memcpy(&tp, &q[o], sizeof(tp)); + (void)memcpy(&tp, &q[o4], sizeof(tp)); inp[i].pi_tp = CDF_TOLE8((uint64_t)tp); break; case CDF_CLIPBOARD: @@ -1015,7 +1036,8 @@ cdf_dump_sat(const char *prefix, const cdf_sat_t *sat, size_t size) size_t i, j, s = size / sizeof(cdf_secid_t); for (i = 0; i < sat->sat_len; i++) { - (void)fprintf(stderr, "%s[%zu]:\n%.6d: ", prefix, i, i * s); + (void)fprintf(stderr, "%s[%" SIZE_T_FORMAT "u]:\n%.6d: ", + prefix, i, i * s); for (j = 0; j < s; j++) { (void)fprintf(stderr, "%5d, ", CDF_TOLE4(sat->sat_tab[s * i + j])); @@ -1072,7 +1094,8 @@ cdf_dump_dir(const cdf_info_t *info, const cdf_header_t *h, d = &dir->dir_tab[i]; for (j = 0; j < sizeof(name); j++) name[j] = (char)CDF_TOLE2(d->d_name[j]); - (void)fprintf(stderr, "Directory %zu: %s\n", i, name); + (void)fprintf(stderr, "Directory %" SIZE_T_FORMAT "u: %s\n", + i, name); if (d->d_type < __arraycount(types)) (void)fprintf(stderr, "Type: %s\n", types[d->d_type]); else @@ -1107,7 +1130,7 @@ cdf_dump_dir(const cdf_info_t *info, const cdf_header_t 
*h, default: break; } - + } } @@ -1121,7 +1144,7 @@ cdf_dump_property_info(const cdf_property_info_t *info, size_t count) for (i = 0; i < count; i++) { cdf_print_property_name(buf, sizeof(buf), info[i].pi_id); - (void)fprintf(stderr, "%zu) %s: ", i, buf); + (void)fprintf(stderr, "%" SIZE_T_FORMAT "u) %s: ", i, buf); switch (info[i].pi_type) { case CDF_NULL: break; @@ -1143,11 +1166,11 @@ cdf_dump_property_info(const cdf_property_info_t *info, size_t count) info[i].pi_str.s_len, info[i].pi_str.s_buf); break; case CDF_LENGTH32_WSTRING: - (void)fprintf(stderr, "string %u [", + (void)fprintf(stderr, "string %u [", info[i].pi_str.s_len); for (j = 0; j < info[i].pi_str.s_len - 1; j++) (void)fputc(info[i].pi_str.s_buf[j << 1], stderr); - (void)fprintf(stderr, "]\n"); + (void)fprintf(stderr, "]\n"); break; case CDF_FILETIME: tp = info[i].pi_tp; diff --git a/contrib/file/src/cdf.h b/contrib/file/src/cdf.h index 6fa3fc6939..a1b998be38 100644 --- a/contrib/file/src/cdf.h +++ b/contrib/file/src/cdf.h @@ -24,12 +24,27 @@ * POSSIBILITY OF SUCH DAMAGE. */ /* - * Info from: http://sc.openoffice.org/compdocfileformat.pdf + * Parse Composite Document Files, the format used in Microsoft Office + * document files before they switched to zipped XML. + * Info from: http://sc.openoffice.org/compdocfileformat.pdf + * + * N.B. This is the "Composite Document File" format, and not the + * "Compound Document Format", nor the "Channel Definition Format". 
*/ #ifndef _H_CDF_ #define _H_CDF_ +#ifdef WIN32 +#include +#define timespec timeval +#define tv_nsec tv_usec +#endif +#ifdef __DJGPP__ +#define timespec timeval +#define tv_nsec tv_usec +#endif + typedef int32_t cdf_secid_t; #define CDF_LOOP_LIMIT 10000 @@ -41,24 +56,24 @@ typedef int32_t cdf_secid_t; #define CDF_SECID_MASTER_SECTOR_ALLOCATION_TABLE -4 typedef struct { - uint64_t h_magic; + uint64_t h_magic; #define CDF_MAGIC 0xE11AB1A1E011CFD0LL - uint64_t h_uuid[2]; - uint16_t h_revision; - uint16_t h_version; - uint16_t h_byte_order; - uint16_t h_sec_size_p2; - uint16_t h_short_sec_size_p2; - uint8_t h_unused0[10]; - uint32_t h_num_sectors_in_sat; - uint32_t h_secid_first_directory; - uint8_t h_unused1[4]; - uint32_t h_min_size_standard_stream; - cdf_secid_t h_secid_first_sector_in_short_sat; - uint32_t h_num_sectors_in_short_sat; - cdf_secid_t h_secid_first_sector_in_master_sat; - uint32_t h_num_sectors_in_master_sat; - cdf_secid_t h_master_sat[436/4]; + uint64_t h_uuid[2]; + uint16_t h_revision; + uint16_t h_version; + uint16_t h_byte_order; + uint16_t h_sec_size_p2; + uint16_t h_short_sec_size_p2; + uint8_t h_unused0[10]; + uint32_t h_num_sectors_in_sat; + uint32_t h_secid_first_directory; + uint8_t h_unused1[4]; + uint32_t h_min_size_standard_stream; + cdf_secid_t h_secid_first_sector_in_short_sat; + uint32_t h_num_sectors_in_short_sat; + cdf_secid_t h_secid_first_sector_in_master_sat; + uint32_t h_num_sectors_in_master_sat; + cdf_secid_t h_master_sat[436/4]; } cdf_header_t; #define CDF_SEC_SIZE(h) (1 << (h)->h_sec_size_p2) @@ -74,92 +89,92 @@ typedef int64_t cdf_timestamp_t; #define CDF_TIME_PREC 10000000 typedef struct { - uint16_t d_name[32]; - uint16_t d_namelen; - uint8_t d_type; + uint16_t d_name[32]; + uint16_t d_namelen; + uint8_t d_type; #define CDF_DIR_TYPE_EMPTY 0 #define CDF_DIR_TYPE_USER_STORAGE 1 #define CDF_DIR_TYPE_USER_STREAM 2 #define CDF_DIR_TYPE_LOCKBYTES 3 #define CDF_DIR_TYPE_PROPERTY 4 #define CDF_DIR_TYPE_ROOT_STORAGE 5 - uint8_t 
d_color; + uint8_t d_color; #define CDF_DIR_COLOR_READ 0 #define CDF_DIR_COLOR_BLACK 1 - cdf_dirid_t d_left_child; - cdf_dirid_t d_right_child; - cdf_dirid_t d_storage; - uint64_t d_storage_uuid[2]; - uint32_t d_flags; - cdf_timestamp_t d_created; - cdf_timestamp_t d_modified; - cdf_secid_t d_stream_first_sector; - uint32_t d_size; - uint32_t d_unused0; + cdf_dirid_t d_left_child; + cdf_dirid_t d_right_child; + cdf_dirid_t d_storage; + uint64_t d_storage_uuid[2]; + uint32_t d_flags; + cdf_timestamp_t d_created; + cdf_timestamp_t d_modified; + cdf_secid_t d_stream_first_sector; + uint32_t d_size; + uint32_t d_unused0; } cdf_directory_t; #define CDF_DIRECTORY_SIZE 128 typedef struct { - cdf_secid_t *sat_tab; - size_t sat_len; + cdf_secid_t *sat_tab; + size_t sat_len; } cdf_sat_t; typedef struct { - cdf_directory_t *dir_tab; - size_t dir_len; + cdf_directory_t *dir_tab; + size_t dir_len; } cdf_dir_t; typedef struct { - void *sst_tab; - size_t sst_len; - size_t sst_dirlen; + void *sst_tab; + size_t sst_len; + size_t sst_dirlen; } cdf_stream_t; typedef struct { - uint32_t cl_dword; - uint16_t cl_word[2]; - uint8_t cl_two[2]; - uint8_t cl_six[6]; + uint32_t cl_dword; + uint16_t cl_word[2]; + uint8_t cl_two[2]; + uint8_t cl_six[6]; } cdf_classid_t; typedef struct { - uint16_t si_byte_order; - uint16_t si_zero; - uint16_t si_os_version; - uint16_t si_os; - cdf_classid_t si_class; - uint32_t si_count; + uint16_t si_byte_order; + uint16_t si_zero; + uint16_t si_os_version; + uint16_t si_os; + cdf_classid_t si_class; + uint32_t si_count; } cdf_summary_info_header_t; #define CDF_SECTION_DECLARATION_OFFSET 0x1c typedef struct { - cdf_classid_t sd_class; - uint32_t sd_offset; + cdf_classid_t sd_class; + uint32_t sd_offset; } cdf_section_declaration_t; typedef struct { - uint32_t sh_len; - uint32_t sh_properties; + uint32_t sh_len; + uint32_t sh_properties; } cdf_section_header_t; typedef struct { - uint32_t pi_id; - uint32_t pi_type; - union { - uint16_t _pi_u16; - int16_t 
_pi_s16; - uint32_t _pi_u32; - int32_t _pi_s32; - uint64_t _pi_u64; - int64_t _pi_s64; - cdf_timestamp_t _pi_tp; - struct { - uint32_t s_len; - const char *s_buf; - } _pi_str; - } pi_val; + uint32_t pi_id; + uint32_t pi_type; + union { + uint16_t _pi_u16; + int16_t _pi_s16; + uint32_t _pi_u32; + int32_t _pi_s32; + uint64_t _pi_u64; + int64_t _pi_s64; + cdf_timestamp_t _pi_tp; + struct { + uint32_t s_len; + const char *s_buf; + } _pi_str; + } pi_val; #define pi_u64 pi_val._pi_u64 #define pi_s64 pi_val._pi_s64 #define pi_u32 pi_val._pi_u32 @@ -226,7 +241,7 @@ typedef struct { #define CDF_PROPERTY_SUBJECT 0x00000003 #define CDF_PROPERTY_AUTHOR 0x00000004 #define CDF_PROPERTY_KEYWORDS 0x00000005 -#define CDF_PROPERTY_COMMENTS 0x00000006 +#define CDF_PROPERTY_COMMENTS 0x00000006 #define CDF_PROPERTY_TEMPLATE 0x00000007 #define CDF_PROPERTY_LAST_SAVED_BY 0x00000008 #define CDF_PROPERTY_REVISION_NUMBER 0x00000009 @@ -243,9 +258,9 @@ typedef struct { #define CDF_PROPERTY_LOCALE_ID 0x80000000 typedef struct { - int i_fd; - const unsigned char *i_buf; - size_t i_len; + int i_fd; + const unsigned char *i_buf; + size_t i_len; } cdf_info_t; struct timespec; diff --git a/contrib/file/src/cdf_time.c b/contrib/file/src/cdf_time.c index 14dcfc6f74..9d0b1dfea4 100644 --- a/contrib/file/src/cdf_time.c +++ b/contrib/file/src/cdf_time.c @@ -27,7 +27,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: cdf_time.c,v 1.8 2009/06/20 20:47:30 christos Exp $") +FILE_RCSID("@(#)$File: cdf_time.c,v 1.9 2010/10/02 15:36:15 christos Exp $") #endif #include @@ -45,12 +45,6 @@ static const int mdays[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 }; -#ifdef __DJGPP__ -#define timespec timeval -#define tv_nsec tv_usec -#endif - - /* * Return the number of days between jan 01 1601 and jan 01 of year. 
*/ diff --git a/contrib/file/src/compress.c b/contrib/file/src/compress.c index f04ab27d79..9040695050 100644 --- a/contrib/file/src/compress.c +++ b/contrib/file/src/compress.c @@ -35,7 +35,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: compress.c,v 1.64 2009/05/08 17:41:58 christos Exp $") +FILE_RCSID("@(#)$File: compress.c,v 1.65 2010/07/21 16:47:17 christos Exp $") #endif #include "magic.h" @@ -45,7 +45,9 @@ FILE_RCSID("@(#)$File: compress.c,v 1.64 2009/05/08 17:41:58 christos Exp $") #endif #include #include +#ifndef __MINGW32__ #include +#endif #ifdef HAVE_SYS_WAIT_H #include #endif @@ -79,12 +81,11 @@ private const struct { { "\3757zXZ\0",6,{ "xz", "-cd", NULL }, 1 }, /* XZ Utils */ }; -private size_t ncompr = sizeof(compr) / sizeof(compr[0]); - #define NODATA ((size_t)~0) - private ssize_t swrite(int, const void *, size_t); +#if HAVE_FORK +private size_t ncompr = sizeof(compr) / sizeof(compr[0]); private size_t uncompressbuf(struct magic_set *, int, size_t, const unsigned char *, unsigned char **, size_t); #ifdef BUILTIN_DECOMPRESS @@ -137,7 +138,7 @@ error: ms->flags |= MAGIC_COMPRESS; return rv; } - +#endif /* * `safe' write for sockets and pipes. */ @@ -167,9 +168,12 @@ swrite(int fd, const void *buf, size_t n) * `safe' read for sockets and pipes. 
*/ protected ssize_t -sread(int fd, void *buf, size_t n, int canbepipe) +sread(int fd, void *buf, size_t n, int canbepipe __attribute__ ((unused))) { - ssize_t rv, cnt; + ssize_t rv; +#ifdef FD_ZERO + ssize_t cnt; +#endif #ifdef FIONREAD int t = 0; #endif @@ -236,7 +240,10 @@ file_pipe2file(struct magic_set *ms, int fd, const void *startbuf, { char buf[4096]; ssize_t r; - int tfd, te; + int tfd; +#ifdef HAVE_MKSTEMP + int te; +#endif (void)strlcpy(buf, "/tmp/file.XXXXXX", sizeof buf); #ifndef HAVE_MKSTEMP @@ -294,7 +301,7 @@ file_pipe2file(struct magic_set *ms, int fd, const void *startbuf, } return fd; } - +#if HAVE_FORK #ifdef BUILTIN_DECOMPRESS #define FHCRC (1 << 1) @@ -494,3 +501,4 @@ err: return n; } } +#endif diff --git a/contrib/file/src/elfclass.h b/contrib/file/src/elfclass.h index 27817d0fef..7f3da864a5 100644 --- a/contrib/file/src/elfclass.h +++ b/contrib/file/src/elfclass.h @@ -35,6 +35,7 @@ switch (type) { #ifdef ELFCORE case ET_CORE: + flags |= FLAGS_IS_CORE; if (dophn_core(ms, clazz, swap, fd, (off_t)elf_getu(swap, elfhdr.e_phoff), elf_getu16(swap, elfhdr.e_phnum), diff --git a/contrib/file/src/encoding.c b/contrib/file/src/encoding.c index 0440514e0d..097673adac 100644 --- a/contrib/file/src/encoding.c +++ b/contrib/file/src/encoding.c @@ -35,7 +35,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: encoding.c,v 1.4 2009/09/13 19:02:22 christos Exp $") +FILE_RCSID("@(#)$File: encoding.c,v 1.5 2010/07/21 16:47:17 christos Exp $") #endif /* lint */ #include "magic.h" @@ -84,15 +84,15 @@ file_encoding(struct magic_set *ms, const unsigned char *buf, size_t nbytes, uni *type = "text"; if (looks_ascii(buf, nbytes, *ubuf, ulen)) { - DPRINTF(("ascii %zu\n", *ulen)); + DPRINTF(("ascii %" SIZE_T_FORMAT "u\n", *ulen)); *code = "ASCII"; *code_mime = "us-ascii"; } else if (looks_utf8_with_BOM(buf, nbytes, *ubuf, ulen) > 0) { - DPRINTF(("utf8/bom %zu\n", *ulen)); + DPRINTF(("utf8/bom %" SIZE_T_FORMAT "u\n", *ulen)); *code = "UTF-8 Unicode (with BOM)"; 
*code_mime = "utf-8"; } else if (file_looks_utf8(buf, nbytes, *ubuf, ulen) > 1) { - DPRINTF(("utf8 %zu\n", *ulen)); + DPRINTF(("utf8 %" SIZE_T_FORMAT "u\n", *ulen)); *code = "UTF-8 Unicode (with BOM)"; *code = "UTF-8 Unicode"; *code_mime = "utf-8"; @@ -104,24 +104,25 @@ file_encoding(struct magic_set *ms, const unsigned char *buf, size_t nbytes, uni *code = "Big-endian UTF-16 Unicode"; *code_mime = "utf-16be"; } - DPRINTF(("ucs16 %zu\n", *ulen)); + DPRINTF(("ucs16 %" SIZE_T_FORMAT "u\n", *ulen)); } else if (looks_latin1(buf, nbytes, *ubuf, ulen)) { - DPRINTF(("latin1 %zu\n", *ulen)); + DPRINTF(("latin1 %" SIZE_T_FORMAT "u\n", *ulen)); *code = "ISO-8859"; *code_mime = "iso-8859-1"; } else if (looks_extended(buf, nbytes, *ubuf, ulen)) { - DPRINTF(("extended %zu\n", *ulen)); + DPRINTF(("extended %" SIZE_T_FORMAT "u\n", *ulen)); *code = "Non-ISO extended-ASCII"; *code_mime = "unknown-8bit"; } else { from_ebcdic(buf, nbytes, nbuf); if (looks_ascii(nbuf, nbytes, *ubuf, ulen)) { - DPRINTF(("ebcdic %zu\n", *ulen)); + DPRINTF(("ebcdic %" SIZE_T_FORMAT "u\n", *ulen)); *code = "EBCDIC"; *code_mime = "ebcdic"; } else if (looks_latin1(nbuf, nbytes, *ubuf, ulen)) { - DPRINTF(("ebcdic/international %zu\n", *ulen)); + DPRINTF(("ebcdic/international %" SIZE_T_FORMAT "u\n", + *ulen)); *code = "International EBCDIC"; *code_mime = "ebcdic"; } else { /* Doesn't look like text at all */ diff --git a/contrib/file/src/file.c b/contrib/file/src/file.c index 3b73c5662e..89769f5e54 100644 --- a/contrib/file/src/file.c +++ b/contrib/file/src/file.c @@ -32,7 +32,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: file.c,v 1.136 2009/12/06 23:18:04 rrt Exp $") +FILE_RCSID("@(#)$File: file.c,v 1.140 2010/11/30 14:58:53 rrt Exp $") #endif /* lint */ #include "magic.h" @@ -73,15 +73,16 @@ int getopt_long(int argc, char * const *argv, const char *optstring, const struc #include "patchlevel.h" #ifdef S_IFLNK -#define FILE_FLAGS "-bchikLNnprsvz0" +#define FILE_FLAGS "-bchikLlNnprsvz0" #else 
-#define FILE_FLAGS "-bcikNnprsvz0" +#define FILE_FLAGS "-bciklNnprsvz0" #endif # define USAGE \ "Usage: %s [" FILE_FLAGS \ "] [--apple] [--mime-encoding] [--mime-type]\n" \ - " [-e testname] [-F separator] [-f namefile] [-m magicfiles] file ...\n" \ + " [-e testname] [-F separator] [-f namefile] [-m magicfiles] " \ + "file ...\n" \ " %s -C [-m magicfiles]\n" \ " %s [--help]\n" @@ -106,7 +107,7 @@ private const struct option long_options[] = { #undef OPT_LONGONLY {0, 0, NULL, 0} }; -#define OPTSTRING "bcCde:f:F:hikLm:nNprsvz0" +#define OPTSTRING "bcCde:f:F:hiklLm:nNprsvz0" private const struct { const char *name; @@ -120,6 +121,7 @@ private const struct { { "encoding", MAGIC_NO_CHECK_ENCODING }, { "soft", MAGIC_NO_CHECK_SOFT }, { "tar", MAGIC_NO_CHECK_TAR }, + { "text", MAGIC_NO_CHECK_TEXT }, /* synonym for ascii */ { "tokens", MAGIC_NO_CHECK_TOKENS }, }; @@ -227,6 +229,9 @@ main(int argc, char *argv[]) case 'k': flags |= MAGIC_CONTINUE; break; + case 'l': + action = FILE_LIST; + break; case 'm': magicfile = optarg; break; @@ -248,7 +253,7 @@ main(int argc, char *argv[]) flags |= MAGIC_DEVICES; break; case 'v': - if (magicfile == NULL) + if (magicfile == NULL) magicfile = magic_getpath(magicfile, action); (void)fprintf(stderr, "%s-%d.%.2d\n", progname, FILE_VERSION_MAJOR, patchlevel); @@ -281,6 +286,7 @@ main(int argc, char *argv[]) switch(action) { case FILE_CHECK: case FILE_COMPILE: + case FILE_LIST: /* * Don't try to check/compile ~/.magic unless we explicitly * ask for it. @@ -291,8 +297,19 @@ main(int argc, char *argv[]) strerror(errno)); return 1; } - c = action == FILE_CHECK ? 
magic_check(magic, magicfile) : - magic_compile(magic, magicfile); + switch(action) { + case FILE_CHECK: + c = magic_check(magic, magicfile); + break; + case FILE_COMPILE: + c = magic_compile(magic, magicfile); + break; + case FILE_LIST: + c = magic_list(magic, magicfile); + break; + default: + abort(); + } if (c == -1) { (void)fprintf(stderr, "%s: %s\n", progname, magic_error(magic)); @@ -407,8 +424,7 @@ process(struct magic_set *ms, const char *inname, int wid) (void)printf("%s", std_in ? "/dev/stdin" : inname); if (nulsep) (void)putc('\0', stdout); - else - (void)printf("%s", separator); + (void)printf("%s", separator); (void)printf("%*s ", (int) (nopad ? 0 : (wid - file_mbswidth(inname))), ""); } diff --git a/contrib/file/src/file.h b/contrib/file/src/file.h index c07f2d4540..9f2b7ffb18 100644 --- a/contrib/file/src/file.h +++ b/contrib/file/src/file.h @@ -27,7 +27,7 @@ */ /* * file.h - definitions for file(1) program - * @(#)$File: file.h,v 1.124 2010/01/16 17:45:12 chl Exp $ + * @(#)$File: file.h,v 1.130 2011/01/04 19:29:32 rrt Exp $ */ #ifndef __file_h__ @@ -37,6 +37,18 @@ #include #endif +#ifdef WIN32 + #ifdef _WIN64 + #define SIZE_T_FORMAT "I64" + #else + #define SIZE_T_FORMAT "" + #endif + #define INT64_T_FORMAT "I64" +#else + #define SIZE_T_FORMAT "z" + #define INT64_T_FORMAT "ll" +#endif + #include /* Include that here, to make sure __P gets defined */ #include #include /* For open and flags */ @@ -62,7 +74,7 @@ #define MAGIC "/etc/magic" #endif -#ifdef __EMX__ +#if defined(__EMX__) || defined (WIN32) #define PATHSEP ';' #else #define PATHSEP ':' @@ -104,15 +116,16 @@ #define MAXMAGIS 8192 /* max entries in any one magic file or directory */ #define MAXDESC 64 /* max leng of text description/MIME type */ -#define MAXstring 32 /* max leng of "string" types */ +#define MAXstring 64 /* max leng of "string" types */ #define MAGICNO 0xF11E041C -#define VERSIONNO 7 -#define FILE_MAGICSIZE 200 +#define VERSIONNO 8 +#define FILE_MAGICSIZE 232 #define FILE_LOAD 
0 #define FILE_CHECK 1 #define FILE_COMPILE 2 +#define FILE_LIST 3 union VALUETYPE { uint8_t b; @@ -265,11 +278,11 @@ struct magic { #define str_flags _u._s._flags /* Words 9-16 */ union VALUETYPE value; /* either number or string */ - /* Words 17-24 */ + /* Words 17-32 */ char desc[MAXDESC]; /* description */ - /* Words 25-32 */ + /* Words 33-48 */ char mimetype[MAXDESC]; /* MIME type */ - /* Words 33-34 */ + /* Words 49-50 */ char apple[8]; }; @@ -281,6 +294,15 @@ struct magic { #define REGEX_OFFSET_START BIT(4) #define STRING_TEXTTEST BIT(5) #define STRING_BINTEST BIT(6) +#define PSTRING_1_BE BIT(7) +#define PSTRING_1_LE BIT(7) +#define PSTRING_2_BE BIT(8) +#define PSTRING_2_LE BIT(9) +#define PSTRING_4_BE BIT(10) +#define PSTRING_4_LE BIT(11) +#define PSTRING_LEN \ + (PSTRING_1_BE|PSTRING_2_LE|PSTRING_2_BE|PSTRING_4_LE|PSTRING_4_BE) +#define PSTRING_LENGTH_INCLUDES_ITSELF BIT(12) #define CHAR_COMPACT_WHITESPACE 'W' #define CHAR_COMPACT_OPTIONAL_WHITESPACE 'w' #define CHAR_IGNORE_LOWERCASE 'c' @@ -288,6 +310,13 @@ struct magic { #define CHAR_REGEX_OFFSET_START 's' #define CHAR_TEXTTEST 't' #define CHAR_BINTEST 'b' +#define CHAR_PSTRING_1_BE 'B' +#define CHAR_PSTRING_1_LE 'B' +#define CHAR_PSTRING_2_BE 'H' +#define CHAR_PSTRING_2_LE 'h' +#define CHAR_PSTRING_4_BE 'L' +#define CHAR_PSTRING_4_LE 'l' +#define CHAR_PSTRING_LENGTH_INCLUDES_ITSELF 'J' #define STRING_IGNORE_CASE (STRING_IGNORE_LOWERCASE|STRING_IGNORE_UPPERCASE) #define STRING_DEFAULT_RANGE 100 @@ -364,8 +393,10 @@ protected int file_tryelf(struct magic_set *, int, const unsigned char *, size_t); protected int file_trycdf(struct magic_set *, int, const unsigned char *, size_t); +#if HAVE_FORK protected int file_zmagic(struct magic_set *, int, const char *, const unsigned char *, size_t); +#endif protected int file_ascmagic(struct magic_set *, const unsigned char *, size_t); protected int file_ascmagic_with_encoding(struct magic_set *, const unsigned char *, size_t, unichar *, size_t, const char *, @@ 
-396,6 +427,8 @@ protected ssize_t sread(int, void *, size_t, int); protected int file_check_mem(struct magic_set *, unsigned int); protected int file_looks_utf8(const unsigned char *, size_t, unichar *, size_t *); +protected size_t file_pstring_length_size(const struct magic *); +protected size_t file_pstring_get_length(const struct magic *, const char *); #ifdef __EMX__ protected int file_os2_apptype(struct magic_set *, const char *, const void *, size_t); diff --git a/contrib/file/src/file_opts.h b/contrib/file/src/file_opts.h index 1a73e8732b..bb8d0a0a5d 100644 --- a/contrib/file/src/file_opts.h +++ b/contrib/file/src/file_opts.h @@ -33,6 +33,7 @@ OPT_LONGONLY("mime-type", 0, " output the MIME type\n") OPT_LONGONLY("mime-encoding", 0, " output the MIME encoding\n") OPT('k', "keep-going", 0, " don't stop at the first match\n") #ifdef S_IFLNK +OPT('l', "list", 0, " list magic strength\n") OPT('L', "dereference", 0, " follow symlinks (default)\n") OPT('h', "no-dereference", 0, " don't follow symlinks\n") #endif diff --git a/contrib/file/src/fsmagic.c b/contrib/file/src/fsmagic.c index 537fb14d40..b80d74a903 100644 --- a/contrib/file/src/fsmagic.c +++ b/contrib/file/src/fsmagic.c @@ -32,7 +32,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: fsmagic.c,v 1.60 2009/05/08 17:41:59 christos Exp $") +FILE_RCSID("@(#)$File: fsmagic.c,v 1.62 2010/09/20 20:16:08 rrt Exp $") #endif /* lint */ #include "magic.h" @@ -59,7 +59,7 @@ FILE_RCSID("@(#)$File: fsmagic.c,v 1.60 2009/05/08 17:41:59 christos Exp $") # define minor(dev) ((dev) & 0xff) #endif #undef HAVE_MAJOR - +#ifdef S_IFLNK private int bad_link(struct magic_set *ms, int err, char *buf) { @@ -83,7 +83,7 @@ bad_link(struct magic_set *ms, int err, char *buf) } return 1; } - +#endif private int handle_mime(struct magic_set *ms, int mime, const char *str) { @@ -134,7 +134,8 @@ file_fsmagic(struct magic_set *ms, const char *fn, struct stat *sb) if (file_printf(ms, "cannot open `%s' (%s)", fn, strerror(errno)) == 
-1) return -1; - return 1; + ms->event_flags |= EVENT_HAD_ERR; + return -1; } if (!mime) { diff --git a/contrib/file/src/funcs.c b/contrib/file/src/funcs.c index 2397417369..e23a72a65c 100644 --- a/contrib/file/src/funcs.c +++ b/contrib/file/src/funcs.c @@ -27,7 +27,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: funcs.c,v 1.54 2009/05/08 17:41:59 christos Exp $") +FILE_RCSID("@(#)$File: funcs.c,v 1.55 2010/07/21 16:47:17 christos Exp $") #endif /* lint */ #include "magic.h" @@ -103,7 +103,7 @@ file_error_core(struct magic_set *ms, int error, const char *f, va_list va, if (lineno != 0) { free(ms->o.buf); ms->o.buf = NULL; - file_printf(ms, "line %zu: ", lineno); + file_printf(ms, "line %" SIZE_T_FORMAT "u: ", lineno); } file_vprintf(ms, f, va); if (error > 0) @@ -138,7 +138,8 @@ file_magerror(struct magic_set *ms, const char *f, ...) protected void file_oomem(struct magic_set *ms, size_t len) { - file_error(ms, errno, "cannot allocate %zu bytes", len); + file_error(ms, errno, "cannot allocate %" SIZE_T_FORMAT "u bytes", + len); } protected void @@ -155,8 +156,8 @@ file_badread(struct magic_set *ms) #ifndef COMPILE_ONLY protected int -file_buffer(struct magic_set *ms, int fd, const char *inname, const void *buf, - size_t nb) +file_buffer(struct magic_set *ms, int fd, const char *inname __attribute__ ((unused)), + const void *buf, size_t nb) { int m = 0, rv = 0, looks_text = 0; int mime = ms->flags & MAGIC_MIME; @@ -200,7 +201,7 @@ file_buffer(struct magic_set *ms, int fd, const char *inname, const void *buf, } } #endif - +#if HAVE_FORK /* try compression stuff */ if ((ms->flags & MAGIC_NO_CHECK_COMPRESS) == 0) if ((m = file_zmagic(ms, fd, inname, ubuf, nb)) != 0) { @@ -208,7 +209,7 @@ file_buffer(struct magic_set *ms, int fd, const char *inname, const void *buf, (void)fprintf(stderr, "zmagic %d\n", m); goto done; } - +#endif /* Check if we have a tar file */ if ((ms->flags & MAGIC_NO_CHECK_TAR) == 0) if ((m = file_is_tar(ms, ubuf, nb)) != 0) { diff --git 
a/contrib/file/src/is_tar.c b/contrib/file/src/is_tar.c index f962edbd8e..876c631bfd 100644 --- a/contrib/file/src/is_tar.c +++ b/contrib/file/src/is_tar.c @@ -2,7 +2,7 @@ * Copyright (c) Ian F. Darwin 1986-1995. * Software written by Ian F. Darwin and others; * maintained 1995-present by Christos Zoulas and others. - * + * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: @@ -12,7 +12,7 @@ * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. - * + * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE @@ -40,7 +40,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: is_tar.c,v 1.36 2009/02/03 20:27:51 christos Exp $") +FILE_RCSID("@(#)$File: is_tar.c,v 1.37 2010/11/30 14:58:53 rrt Exp $") #endif #include "magic.h" @@ -83,8 +83,8 @@ file_is_tar(struct magic_set *ms, const unsigned char *buf, size_t nbytes) } /* - * Return - * 0 if the checksum is bad (i.e., probably not a tar archive), + * Return + * 0 if the checksum is bad (i.e., probably not a tar archive), * 1 for old UNIX tar file, * 2 for Unix Std (POSIX) tar file, * 3 for GNU tar file. @@ -95,7 +95,7 @@ is_tar(const unsigned char *buf, size_t nbytes) const union record *header = (const union record *)(const void *)buf; int i; int sum, recsum; - const char *p; + const unsigned char *p; if (nbytes < sizeof(union record)) return 0; @@ -104,25 +104,20 @@ is_tar(const unsigned char *buf, size_t nbytes) sum = 0; p = header->charptr; - for (i = sizeof(union record); --i >= 0;) { - /* - * We cannot use unsigned char here because of old compilers, - * e.g. V7. 
- */ - sum += 0xFF & *p++; - } + for (i = sizeof(union record); --i >= 0;) + sum += *p++; /* Adjust checksum to count the "chksum" field as blanks. */ for (i = sizeof(header->header.chksum); --i >= 0;) - sum -= 0xFF & header->header.chksum[i]; - sum += ' '* sizeof header->header.chksum; + sum -= header->header.chksum[i]; + sum += ' ' * sizeof header->header.chksum; if (sum != recsum) return 0; /* Not a tar archive */ - - if (strcmp(header->header.magic, GNUTMAGIC) == 0) + + if (strcmp(header->header.magic, GNUTMAGIC) == 0) return 3; /* GNU Unix Standard tar archive */ - if (strcmp(header->header.magic, TMAGIC) == 0) + if (strcmp(header->header.magic, TMAGIC) == 0) return 2; /* Unix Standard tar archive */ return 1; /* Old fashioned tar archive */ @@ -132,7 +127,7 @@ is_tar(const unsigned char *buf, size_t nbytes) /* * Quick and dirty octal conversion. * - * Result is -1 if the field is invalid (all blank, or nonoctal). + * Result is -1 if the field is invalid (all blank, or non-octal). */ private int from_oct(int digs, const char *where) @@ -145,13 +140,13 @@ from_oct(int digs, const char *where) return -1; /* All blank field */ } value = 0; - while (digs > 0 && isodigit(*where)) { /* Scan til nonoctal */ + while (digs > 0 && isodigit(*where)) { /* Scan til non-octal */ value = (value << 3) | (*where++ - '0'); --digs; } if (digs > 0 && *where && !isspace((unsigned char)*where)) - return -1; /* Ended on non-space/nul */ + return -1; /* Ended on non-(space/NUL) */ return value; } diff --git a/contrib/file/src/magic.c b/contrib/file/src/magic.c index d49c29a3d8..bcb7000e2c 100644 --- a/contrib/file/src/magic.c +++ b/contrib/file/src/magic.c @@ -25,10 +25,15 @@ * SUCH DAMAGE. 
*/ +#ifdef WIN32 +#include +#include +#endif + #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: magic.c,v 1.65 2009/09/14 17:50:38 christos Exp $") +FILE_RCSID("@(#)$File: magic.c,v 1.69 2010/09/20 14:14:49 christos Exp $") #endif /* lint */ #include "magic.h" @@ -81,14 +86,33 @@ private const char *file_or_fd(struct magic_set *, const char *, int); #define STDIN_FILENO 0 #endif +#ifdef WIN32 +BOOL WINAPI DllMain(HINSTANCE hinstDLL, + DWORD fdwReason __attribute__((__unused__)), + LPVOID lpvReserved __attribute__((__unused__))); + +CHAR dllpath[MAX_PATH + 1] = { 0 }; + +BOOL WINAPI DllMain(HINSTANCE hinstDLL, + DWORD fdwReason __attribute__((__unused__)), + LPVOID lpvReserved __attribute__((__unused__))) +{ + if (dllpath[0] == 0 && + GetModuleFileNameA(hinstDLL, dllpath, MAX_PATH) != 0) + PathRemoveFileSpecA(dllpath); + return TRUE; +} +#endif + private const char * get_default_magic(void) { - static const char hmagic[] = "/.magic"; + static const char hmagic[] = "/.magic/magic.mgc"; static char default_magic[2 * MAXPATHLEN + 2]; char *home; - char hmagicpath[MAXPATHLEN + 1]; + char hmagicpath[MAXPATHLEN + 1] = { 0 }; +#ifndef WIN32 if ((home = getenv("HOME")) == NULL) return MAGIC; @@ -99,6 +123,60 @@ get_default_magic(void) (void)snprintf(default_magic, sizeof(default_magic), "%s:%s", hmagicpath, MAGIC); +#else + char *hmagicp = hmagicpath; + char tmppath[MAXPATHLEN + 1] = { 0 }; + char *hmagicend = &hmagicpath[sizeof(hmagicpath) - 1]; + static const char pathsep[] = { PATHSEP, '\0' }; + +#define APPENDPATH() \ + if (access(tmppath, R_OK) != -1) + hmagicp += snprintf(hmagicp, hmagicend - hmagicp, \ + "%s%s", hmagicp == hmagicpath ? 
"" : pathsep, tmppath) + /* First, try to get user-specific magic file */ + if ((home = getenv("LOCALAPPDATA")) == NULL) { + if ((home = getenv("USERPROFILE")) != NULL) + (void)snprintf(tmppath, sizeof(tmppath), + "%s/Local Settings/Application Data%s", home, + hmagic); + } else { + (void)snprintf(tmppath, sizeof(tmppath), "%s%s", + home, hmagic); + } + if (tmppath[0] != '\0') { + APPENDPATH(); + } + + /* Second, try to get a magic file from Common Files */ + if ((home = getenv("COMMONPROGRAMFILES")) != NULL) { + (void)snprintf(tmppath, sizeof(tmppath), "%s%s", home, hmagic); + APPENDPATH(); + } + + + /* Third, try to get magic file relative to dll location */ + if (dllpath[0] != 0) { + if (strlen(dllpath) > 3 && + stricmp(&dllpath[strlen(dllpath) - 3], "bin") == 0) { + (void)snprintf(tmppath, sizeof(tmppath), + "%s/../share/misc/magic.mgc", dllpath); + APPENDPATH(); + } else { + (void)snprintf(tmppath, sizeof(tmppath), + "%s/share/misc/magic.mgc", dllpath); + APPENDPATH() + else { + (void)snprintf(tmppath, sizeof(tmppath), + "%s/magic.mgc", dllpath); + APPENDPATH(); + } + } + } + + /* Don't put MAGIC constant - it likely points to a file within MSys + tree */ + (void)strlcpy(default_magic, hmagicpath, sizeof(default_magic)); +#endif return default_magic; } @@ -225,6 +303,14 @@ magic_check(struct magic_set *ms, const char *magicfile) return ml ? 0 : -1; } +public int +magic_list(struct magic_set *ms, const char *magicfile) +{ + struct mlist *ml = file_apprentice(ms, magicfile, FILE_LIST); + free_mlist(ml); + return ml ? 
0 : -1; +} + private void close_and_restore(const struct magic_set *ms, const char *name, int fd, const struct stat *sb) @@ -315,7 +401,9 @@ file_or_fd(struct magic_set *ms, const char *inname, int fd) int flags = O_RDONLY|O_BINARY; if (stat(inname, &sb) == 0 && S_ISFIFO(sb.st_mode)) { +#ifdef O_NONBLOCK flags |= O_NONBLOCK; +#endif ispipe = 1; } diff --git a/contrib/file/src/magic.h b/contrib/file/src/magic.h index 765ff2be35..d87523deef 100644 --- a/contrib/file/src/magic.h +++ b/contrib/file/src/magic.h @@ -43,6 +43,7 @@ #define MAGIC_MIME_ENCODING 0x000400 /* Return the MIME encoding */ #define MAGIC_MIME (MAGIC_MIME_TYPE|MAGIC_MIME_ENCODING) #define MAGIC_APPLE 0x000800 /* Return the Apple creator and type */ + #define MAGIC_NO_CHECK_COMPRESS 0x001000 /* Don't check for compressed files */ #define MAGIC_NO_CHECK_TAR 0x002000 /* Don't check for tar files */ #define MAGIC_NO_CHECK_SOFT 0x004000 /* Don't check magic entries */ @@ -53,6 +54,9 @@ #define MAGIC_NO_CHECK_TOKENS 0x100000 /* Don't check tokens */ #define MAGIC_NO_CHECK_ENCODING 0x200000 /* Don't check text encodings */ +/* No built-in tests; only consult the magic file */ +#define MAGIC_NO_CHECK_BUILTIN 0x3fb000 + /* Defined for backwards compatibility (renamed) */ #define MAGIC_NO_CHECK_ASCII MAGIC_NO_CHECK_TEXT @@ -80,6 +84,7 @@ int magic_setflags(magic_t, int); int magic_load(magic_t, const char *); int magic_compile(magic_t, const char *); int magic_check(magic_t, const char *); +int magic_list(magic_t, const char *); int magic_errno(magic_t); #ifdef __cplusplus diff --git a/contrib/file/src/names.h b/contrib/file/src/names.h index 2682edcc3a..f6c13ef978 100644 --- a/contrib/file/src/names.h +++ b/contrib/file/src/names.h @@ -32,7 +32,7 @@ * appear at fixed offsets into the file. Don't make HOWMANY * too high unless you have a very fast CPU. 
* - * $File: names.h,v 1.32 2008/02/11 00:19:29 rrt Exp $ + * $File: names.h,v 1.33 2010/10/08 21:58:44 christos Exp $ */ /* @@ -115,59 +115,62 @@ static const struct { */ static const struct names { char name[14]; - short type; + unsigned char type; + unsigned char score; + } names[] = { /* These must be sorted by eye for optimal hit rate */ /* Add to this list only after substantial meditation */ - {"msgid", L_PO}, - {"dnl", L_M4}, - {"import", L_JAVA}, - {"\"libhdr\"", L_BCPL}, - {"\"LIBHDR\"", L_BCPL}, - {"//", L_CC}, - {"template", L_CC}, - {"virtual", L_CC}, - {"class", L_CC}, - {"public:", L_CC}, - {"private:", L_CC}, - {"/*", L_C}, /* must precede "The", "the", etc. */ - {"#include", L_C}, - {"char", L_C}, - {"The", L_ENG}, - {"the", L_ENG}, - {"double", L_C}, - {"extern", L_C}, - {"float", L_C}, - {"struct", L_C}, - {"union", L_C}, - {"CFLAGS", L_MAKE}, - {"LDFLAGS", L_MAKE}, - {"all:", L_MAKE}, - {".PRECIOUS", L_MAKE}, - {".ascii", L_MACH}, - {".asciiz", L_MACH}, - {".byte", L_MACH}, - {".even", L_MACH}, - {".globl", L_MACH}, - {".text", L_MACH}, - {"clr", L_MACH}, - {"(input,", L_PAS}, - {"program", L_PAS}, - {"record", L_PAS}, - {"dcl", L_PLI}, - {"Received:", L_MAIL}, - {">From", L_MAIL}, - {"Return-Path:",L_MAIL}, - {"Cc:", L_MAIL}, - {"Newsgroups:", L_NEWS}, - {"Path:", L_NEWS}, - {"Organization:",L_NEWS}, - {"href=", L_HTML}, - {"HREF=", L_HTML}, - {"From", L_MAIL, 2 }, + {"Return-Path:",L_MAIL, 2 }, + {"Cc:", L_MAIL, 2 }, + {"Newsgroups:", L_NEWS, 2 }, + {"Path:", L_NEWS, 2 }, + {"Organization:",L_NEWS, 2 }, + {"href=", L_HTML, 2 }, + {"HREF=", L_HTML, 2 }, + {" @@ -120,7 +120,7 @@ file_mdump(struct magic *m) case FILE_BEQUAD: case FILE_LEQUAD: case FILE_QUAD: - (void) fprintf(stderr, "%lld", + (void) fprintf(stderr, "%" INT64_T_FORMAT "d", (unsigned long long)m->value.q); break; case FILE_PSTRING: @@ -198,7 +198,6 @@ file_magwarn(struct magic_set *ms, const char *f, ...) 
(void) fputc('\n', stderr); } -#ifndef COMPILE_ONLY protected const char * file_fmttime(uint32_t v, int local) { @@ -239,4 +238,3 @@ file_fmttime(uint32_t v, int local) out: return "*Invalid time*"; } -#endif diff --git a/contrib/file/src/readcdf.c b/contrib/file/src/readcdf.c index 52cf579c93..b69276a4bc 100644 --- a/contrib/file/src/readcdf.c +++ b/contrib/file/src/readcdf.c @@ -26,7 +26,7 @@ #include "file.h" #ifndef lint -FILE_RCSID("@(#)$File: readcdf.c,v 1.22 2010/01/20 01:36:55 christos Exp $") +FILE_RCSID("@(#)$File: readcdf.c,v 1.23 2010/02/20 15:19:53 rrt Exp $") #endif #include @@ -44,241 +44,241 @@ private int cdf_file_property_info(struct magic_set *ms, const cdf_property_info_t *info, size_t count) { - size_t i; - cdf_timestamp_t tp; - struct timespec ts; - char buf[64]; - const char *str = "vnd.ms-office"; - const char *s; - int len; + size_t i; + cdf_timestamp_t tp; + struct timespec ts; + char buf[64]; + const char *str = "vnd.ms-office"; + const char *s; + int len; - for (i = 0; i < count; i++) { - cdf_print_property_name(buf, sizeof(buf), info[i].pi_id); - switch (info[i].pi_type) { - case CDF_NULL: - break; - case CDF_SIGNED16: - if (NOTMIME(ms) && file_printf(ms, ", %s: %hd", buf, - info[i].pi_s16) == -1) - return -1; - break; - case CDF_SIGNED32: - if (NOTMIME(ms) && file_printf(ms, ", %s: %d", buf, - info[i].pi_s32) == -1) - return -1; - break; - case CDF_UNSIGNED32: - if (NOTMIME(ms) && file_printf(ms, ", %s: %u", buf, - info[i].pi_u32) == -1) - return -1; - break; - case CDF_LENGTH32_STRING: - case CDF_LENGTH32_WSTRING: - len = info[i].pi_str.s_len; - if (len > 1) { - char vbuf[1024]; - size_t j, k = 1; + for (i = 0; i < count; i++) { + cdf_print_property_name(buf, sizeof(buf), info[i].pi_id); + switch (info[i].pi_type) { + case CDF_NULL: + break; + case CDF_SIGNED16: + if (NOTMIME(ms) && file_printf(ms, ", %s: %hd", buf, + info[i].pi_s16) == -1) + return -1; + break; + case CDF_SIGNED32: + if (NOTMIME(ms) && file_printf(ms, ", %s: %d", 
buf, + info[i].pi_s32) == -1) + return -1; + break; + case CDF_UNSIGNED32: + if (NOTMIME(ms) && file_printf(ms, ", %s: %u", buf, + info[i].pi_u32) == -1) + return -1; + break; + case CDF_LENGTH32_STRING: + case CDF_LENGTH32_WSTRING: + len = info[i].pi_str.s_len; + if (len > 1) { + char vbuf[1024]; + size_t j, k = 1; - if (info[i].pi_type == CDF_LENGTH32_WSTRING) - k++; - s = info[i].pi_str.s_buf; - for (j = 0; j < sizeof(vbuf) && len--; - j++, s += k) { - if (*s == '\0') - break; - if (isprint((unsigned char)*s)) - vbuf[j] = *s; - } - if (j == sizeof(vbuf)) - --j; - vbuf[j] = '\0'; - if (NOTMIME(ms)) { - if (vbuf[0]) { - if (file_printf(ms, ", %s: %s", - buf, vbuf) == -1) - return -1; - } - } else if (info[i].pi_id == - CDF_PROPERTY_NAME_OF_APPLICATION) { - if (strstr(vbuf, "Word")) - str = "msword"; - else if (strstr(vbuf, "Excel")) - str = "vnd.ms-excel"; - else if (strstr(vbuf, "Powerpoint")) - str = "vnd.ms-powerpoint"; - else if (strstr(vbuf, - "Crystal Reports")) - str = "x-rpt"; - } - } - break; - case CDF_FILETIME: - tp = info[i].pi_tp; - if (tp != 0) { - if (tp < 1000000000000000LL) { - char tbuf[64]; - cdf_print_elapsed_time(tbuf, - sizeof(tbuf), tp); - if (NOTMIME(ms) && file_printf(ms, - ", %s: %s", buf, tbuf) == -1) - return -1; - } else { - char *c, *ec; - cdf_timestamp_to_timespec(&ts, tp); - c = ctime(&ts.tv_sec); - if ((ec = strchr(c, '\n')) != NULL) - *ec = '\0'; + if (info[i].pi_type == CDF_LENGTH32_WSTRING) + k++; + s = info[i].pi_str.s_buf; + for (j = 0; j < sizeof(vbuf) && len--; + j++, s += k) { + if (*s == '\0') + break; + if (isprint((unsigned char)*s)) + vbuf[j] = *s; + } + if (j == sizeof(vbuf)) + --j; + vbuf[j] = '\0'; + if (NOTMIME(ms)) { + if (vbuf[0]) { + if (file_printf(ms, ", %s: %s", + buf, vbuf) == -1) + return -1; + } + } else if (info[i].pi_id == + CDF_PROPERTY_NAME_OF_APPLICATION) { + if (strstr(vbuf, "Word")) + str = "msword"; + else if (strstr(vbuf, "Excel")) + str = "vnd.ms-excel"; + else if (strstr(vbuf, "Powerpoint")) + 
str = "vnd.ms-powerpoint"; + else if (strstr(vbuf, + "Crystal Reports")) + str = "x-rpt"; + } + } + break; + case CDF_FILETIME: + tp = info[i].pi_tp; + if (tp != 0) { + if (tp < 1000000000000000LL) { + char tbuf[64]; + cdf_print_elapsed_time(tbuf, + sizeof(tbuf), tp); + if (NOTMIME(ms) && file_printf(ms, + ", %s: %s", buf, tbuf) == -1) + return -1; + } else { + char *c, *ec; + cdf_timestamp_to_timespec(&ts, tp); + c = ctime(&ts.tv_sec); + if ((ec = strchr(c, '\n')) != NULL) + *ec = '\0'; - if (NOTMIME(ms) && file_printf(ms, - ", %s: %s", buf, c) == -1) - return -1; - } - } - break; - case CDF_CLIPBOARD: - break; - default: - return -1; - } - } - if (!NOTMIME(ms)) { - if (file_printf(ms, "application/%s", str) == -1) - return -1; - } - return 1; + if (NOTMIME(ms) && file_printf(ms, + ", %s: %s", buf, c) == -1) + return -1; + } + } + break; + case CDF_CLIPBOARD: + break; + default: + return -1; + } + } + if (!NOTMIME(ms)) { + if (file_printf(ms, "application/%s", str) == -1) + return -1; + } + return 1; } private int cdf_file_summary_info(struct magic_set *ms, const cdf_stream_t *sst) { - cdf_summary_info_header_t si; - cdf_property_info_t *info; - size_t count; - int m; + cdf_summary_info_header_t si; + cdf_property_info_t *info; + size_t count; + int m; - if (cdf_unpack_summary_info(sst, &si, &info, &count) == -1) - return -1; + if (cdf_unpack_summary_info(sst, &si, &info, &count) == -1) + return -1; - if (NOTMIME(ms)) { - if (file_printf(ms, "CDF V2 Document") == -1) - return -1; + if (NOTMIME(ms)) { + if (file_printf(ms, "Composite Document File V2 Document") == -1) + return -1; - if (file_printf(ms, ", %s Endian", - si.si_byte_order == 0xfffe ? 
"Little" : "Big") == -1) - return -1; - switch (si.si_os) { - case 2: - if (file_printf(ms, ", Os: Windows, Version %d.%d", - si.si_os_version & 0xff, - (uint32_t)si.si_os_version >> 8) == -1) - return -1; - break; - case 1: - if (file_printf(ms, ", Os: MacOS, Version %d.%d", - (uint32_t)si.si_os_version >> 8, - si.si_os_version & 0xff) == -1) - return -1; - break; - default: - if (file_printf(ms, ", Os %d, Version: %d.%d", si.si_os, - si.si_os_version & 0xff, - (uint32_t)si.si_os_version >> 8) == -1) - return -1; - break; - } - } + if (file_printf(ms, ", %s Endian", + si.si_byte_order == 0xfffe ? "Little" : "Big") == -1) + return -1; + switch (si.si_os) { + case 2: + if (file_printf(ms, ", Os: Windows, Version %d.%d", + si.si_os_version & 0xff, + (uint32_t)si.si_os_version >> 8) == -1) + return -1; + break; + case 1: + if (file_printf(ms, ", Os: MacOS, Version %d.%d", + (uint32_t)si.si_os_version >> 8, + si.si_os_version & 0xff) == -1) + return -1; + break; + default: + if (file_printf(ms, ", Os %d, Version: %d.%d", si.si_os, + si.si_os_version & 0xff, + (uint32_t)si.si_os_version >> 8) == -1) + return -1; + break; + } + } - m = cdf_file_property_info(ms, info, count); - free(info); + m = cdf_file_property_info(ms, info, count); + free(info); - return m; + return m; } protected int file_trycdf(struct magic_set *ms, int fd, const unsigned char *buf, size_t nbytes) { - cdf_info_t info; - cdf_header_t h; - cdf_sat_t sat, ssat; - cdf_stream_t sst, scn; - cdf_dir_t dir; - int i; - const char *expn = ""; - const char *corrupt = "corrupt: "; + cdf_info_t info; + cdf_header_t h; + cdf_sat_t sat, ssat; + cdf_stream_t sst, scn; + cdf_dir_t dir; + int i; + const char *expn = ""; + const char *corrupt = "corrupt: "; - info.i_fd = fd; - info.i_buf = buf; - info.i_len = nbytes; - if (ms->flags & MAGIC_APPLE) - return 0; - if (cdf_read_header(&info, &h) == -1) - return 0; + info.i_fd = fd; + info.i_buf = buf; + info.i_len = nbytes; + if (ms->flags & MAGIC_APPLE) + return 0; + if 
(cdf_read_header(&info, &h) == -1)
+		return 0;
 #ifdef CDF_DEBUG
-        cdf_dump_header(&h);
+	cdf_dump_header(&h);
 #endif
-        if ((i = cdf_read_sat(&info, &h, &sat)) == -1) {
-                expn = "Can't read SAT";
-                goto out0;
-        }
+	if ((i = cdf_read_sat(&info, &h, &sat)) == -1) {
+		expn = "Can't read SAT";
+		goto out0;
+	}
 #ifdef CDF_DEBUG
-        cdf_dump_sat("SAT", &sat, CDF_SEC_SIZE(&h));
+	cdf_dump_sat("SAT", &sat, CDF_SEC_SIZE(&h));
 #endif
-        if ((i = cdf_read_ssat(&info, &h, &sat, &ssat)) == -1) {
-                expn = "Can't read SSAT";
-                goto out1;
-        }
+	if ((i = cdf_read_ssat(&info, &h, &sat, &ssat)) == -1) {
+		expn = "Can't read SSAT";
+		goto out1;
+	}
 #ifdef CDF_DEBUG
-        cdf_dump_sat("SSAT", &ssat, CDF_SHORT_SEC_SIZE(&h));
+	cdf_dump_sat("SSAT", &ssat, CDF_SHORT_SEC_SIZE(&h));
 #endif
-        if ((i = cdf_read_dir(&info, &h, &sat, &dir)) == -1) {
-                expn = "Can't read directory";
-                goto out2;
-        }
+	if ((i = cdf_read_dir(&info, &h, &sat, &dir)) == -1) {
+		expn = "Can't read directory";
+		goto out2;
+	}
 
-        if ((i = cdf_read_short_stream(&info, &h, &sat, &dir, &sst)) == -1) {
-                expn = "Cannot read short stream";
-                goto out3;
-        }
+	if ((i = cdf_read_short_stream(&info, &h, &sat, &dir, &sst)) == -1) {
+		expn = "Cannot read short stream";
+		goto out3;
+	}
 #ifdef CDF_DEBUG
-        cdf_dump_dir(&info, &h, &sat, &ssat, &sst, &dir);
+	cdf_dump_dir(&info, &h, &sat, &ssat, &sst, &dir);
 #endif
-        if ((i = cdf_read_summary_info(&info, &h, &sat, &ssat, &sst, &dir,
-            &scn)) == -1) {
-                if (errno == ESRCH) {
-                        corrupt = expn;
-                        expn = "No summary info";
-                } else {
-                        expn = "Cannot read summary info";
-                }
-                goto out4;
-        }
+	if ((i = cdf_read_summary_info(&info, &h, &sat, &ssat, &sst, &dir,
+	    &scn)) == -1) {
+		if (errno == ESRCH) {
+			corrupt = expn;
+			expn = "No summary info";
+		} else {
+			expn = "Cannot read summary info";
+		}
+		goto out4;
+	}
 #ifdef CDF_DEBUG
-        cdf_dump_summary_info(&h, &scn);
+	cdf_dump_summary_info(&h, &scn);
 #endif
-        if ((i = cdf_file_summary_info(ms, &scn)) == -1)
-                expn = "Can't expand summary_info";
-
free(scn.sst_tab);
+	if ((i = cdf_file_summary_info(ms, &scn)) == -1)
+		expn = "Can't expand summary_info";
+	free(scn.sst_tab);
 out4:
-        free(sst.sst_tab);
+	free(sst.sst_tab);
 out3:
-        free(dir.dir_tab);
+	free(dir.dir_tab);
 out2:
-        free(ssat.sat_tab);
+	free(ssat.sat_tab);
 out1:
-        free(sat.sat_tab);
+	free(sat.sat_tab);
 out0:
-        if (i != 1) {
-                if (file_printf(ms, "CDF V2 Document") == -1)
-                        return -1;
-                if (*expn)
-                        if (file_printf(ms, ", %s%s", corrupt, expn) == -1)
-                                return -1;
-                i = 1;
-        }
-        return i;
+	if (i != 1) {
+		if (file_printf(ms, "Composite Document File V2 Document") == -1)
+			return -1;
+		if (*expn)
+			if (file_printf(ms, ", %s%s", corrupt, expn) == -1)
+				return -1;
+		i = 1;
+	}
+	return i;
 }
diff --git a/contrib/file/src/readelf.c b/contrib/file/src/readelf.c
index 5915569f3f..ceb657d9e3 100644
--- a/contrib/file/src/readelf.c
+++ b/contrib/file/src/readelf.c
@@ -27,7 +27,7 @@
 #include "file.h"
 
 #ifndef lint
-FILE_RCSID("@(#)$File: readelf.c,v 1.83 2009/05/13 14:43:10 christos Exp $")
+FILE_RCSID("@(#)$File: readelf.c,v 1.86 2010/07/21 16:47:18 christos Exp $")
 #endif
 
 #ifdef BUILTIN_ELF
@@ -286,6 +286,7 @@ private const char os_style_names[][8] = {
 #define FLAGS_DID_CORE		1
 #define FLAGS_DID_NOTE		2
 #define FLAGS_DID_CORE_STYLE	4
+#define FLAGS_IS_CORE		8
 
 private int
 dophn_core(struct magic_set *ms, int clazz, int swap, int fd, off_t off,
@@ -676,7 +677,7 @@ core:
 		break;
 
 	default:
-		if (xnh_type == NT_PRPSINFO) {
+		if (xnh_type == NT_PRPSINFO && *flags & FLAGS_IS_CORE) {
 			size_t i, j;
 			unsigned char c;
 			/*
@@ -738,6 +739,25 @@ core:
 				/*
 				 * Well, that worked.
 				 */
+
+				/*
+				 * Try next offsets, in case this match is
+				 * in the middle of a string.
+				 */
+				size_t k;
+				for (k = i + 1 ; k < NOFFSETS ; k++) {
+					if (prpsoffsets(k) >= prpsoffsets(i))
+						continue;
+					size_t no;
+					int adjust = 1;
+					for (no = doff + prpsoffsets(k);
+					     no < doff + prpsoffsets(i); no++)
+						adjust = adjust
+						         && isprint(nbuf[no]);
+					if (adjust)
+						i = k;
+				}
+
 				cname = (unsigned char *)
 				    &nbuf[doff + prpsoffsets(i)];
 				for (cp = cname; *cp && isprint(*cp); cp++)
@@ -929,7 +949,8 @@ doshn(struct magic_set *ms, int clazz, int swap, int fd, off_t off, int num,
 				default:
 					if (file_printf(ms,
 					    ", with unknown capability "
-					    "0x%llx = 0x%llx",
+					    "0x%" INT64_T_FORMAT "x = 0x%"
+					    INT64_T_FORMAT "x",
 					    (unsigned long long)xcap_tag,
 					    (unsigned long long)xcap_val) == -1)
 						return -1;
@@ -977,12 +998,13 @@ doshn(struct magic_set *ms, int clazz, int swap, int fd, off_t off, int num,
 				}
 				if (cap_hw1)
 					if (file_printf(ms,
-					    " unknown hardware capability 0x%llx",
+					    " unknown hardware capability 0x%"
+					    INT64_T_FORMAT "x",
 					    (unsigned long long)cap_hw1) == -1)
 						return -1;
 			} else {
 				if (file_printf(ms,
-				    " hardware capability 0x%llx",
+				    " hardware capability 0x%" INT64_T_FORMAT "x",
 				    (unsigned long long)cap_hw1) == -1)
 					return -1;
 			}
@@ -998,7 +1020,8 @@ doshn(struct magic_set *ms, int clazz, int swap, int fd, off_t off, int num,
 			cap_sf1 &= ~SF1_SUNW_MASK;
 			if (cap_sf1)
 				if (file_printf(ms,
-				    ", with unknown software capability 0x%llx",
+				    ", with unknown software capability 0x%"
+				    INT64_T_FORMAT "x",
 				    (unsigned long long)cap_sf1) == -1)
 					return -1;
 		}
diff --git a/contrib/file/src/softmagic.c b/contrib/file/src/softmagic.c
index d8a5675317..99a7b52457 100644
--- a/contrib/file/src/softmagic.c
+++ b/contrib/file/src/softmagic.c
@@ -32,7 +32,7 @@
 #include "file.h"
 
 #ifndef lint
-FILE_RCSID("@(#)$File: softmagic.c,v 1.138 2009/10/19 13:10:20 christos Exp $")
+FILE_RCSID("@(#)$File: softmagic.c,v 1.144 2011/01/07 23:22:28 rrt Exp $")
 #endif	/* lint */
 
 #include "magic.h"
@@ -144,7 +144,7 @@ match(struct magic_set *ms, struct magic *magic, uint32_t nmagic,
 		default:
 			if (m->type == FILE_INDIRECT)
 				returnval = 1;
-				
+
 			switch (magiccheck(ms, m)) {
 			case -1:
 				return -1;
@@ -168,6 +168,8 @@ match(struct magic_set *ms, struct magic *magic, uint32_t nmagic,
 			continue;
 		}
 
+		if ((e = handle_annotation(ms, m)) != 0)
+			return e;
 		/*
 		 * If we are going to print something, we'll need to print
 		 * a blank before we print something else.
@@ -175,8 +177,6 @@ match(struct magic_set *ms, struct magic *magic, uint32_t nmagic,
 		if (*m->desc) {
 			need_separator = 1;
 			printed_something = 1;
-			if ((e = handle_annotation(ms, m)) != 0)
-				return e;
 			if (print_sep(ms, firstline) == -1)
 				return -1;
 		}
@@ -251,13 +251,13 @@ match(struct magic_set *ms, struct magic *magic, uint32_t nmagic,
 					ms->c.li[cont_level].got_match = 0;
 					break;
 				}
+				if ((e = handle_annotation(ms, m)) != 0)
+					return e;
 				/*
 				 * If we are going to print something,
 				 * make sure that we have a separator first.
 				 */
 				if (*m->desc) {
-					if ((e = handle_annotation(ms, m)) != 0)
-						return e;
 					if (!printed_something) {
 						printed_something = 1;
 						if (print_sep(ms, firstline)
@@ -449,7 +449,7 @@ mprint(struct magic_set *ms, struct magic *m)
 				return -1;
 			t = ms->offset + strlen(p->s);
 			if (m->type == FILE_PSTRING)
-				t++;
+				t += file_pstring_length_size(m);
 		}
 		break;
 
@@ -614,7 +614,7 @@ moffset(struct magic_set *ms, struct magic *m)
 			p->s[strcspn(p->s, "\n")] = '\0';
 			t = CAST(uint32_t, (ms->offset + strlen(p->s)));
 			if (m->type == FILE_PSTRING)
-				t++;
+				t += file_pstring_length_size(m);
 			return t;
 		}
 
@@ -790,28 +790,16 @@ mconvert(struct magic_set *ms, struct magic *m)
 	case FILE_LESTRING16: {
 		/* Null terminate and eat *trailing* return */
 		p->s[sizeof(p->s) - 1] = '\0';
-#if 0
-		/* Why?
breaks magic numbers that end with \xa */
-		len = strlen(p->s);
-		if (len-- && p->s[len] == '\n')
-			p->s[len] = '\0';
-#endif
 		return 1;
 	}
 	case FILE_PSTRING: {
-		char *ptr1 = p->s, *ptr2 = ptr1 + 1;
-		size_t len = *p->s;
+		char *ptr1 = p->s, *ptr2 = ptr1 + file_pstring_length_size(m);
+		size_t len = file_pstring_get_length(m, ptr1);
 		if (len >= sizeof(p->s))
 			len = sizeof(p->s) - 1;
 		while (len--)
 			*ptr1++ = *ptr2++;
 		*ptr1 = '\0';
-#if 0
-		/* Why? breaks magic numbers that end with \xa */
-		len = strlen(p->s);
-		if (len-- && p->s[len] == '\n')
-			p->s[len] = '\0';
-#endif
 		return 1;
 	}
 	case FILE_BESHORT:
@@ -945,7 +933,7 @@ mcopy(struct magic_set *ms, union VALUETYPE *p, int type, int indir,
 			buf = (const char *)s + offset;
 			end = last = (const char *)s + nbytes;
 			/* mget() guarantees buf <= last */
-			for (lines = linecnt, b = buf; lines &&
+			for (lines = linecnt, b = buf; lines && b < end &&
 			     ((b = CAST(const char *,
 				 memchr(c = b, '\n', CAST(size_t, (end - b))))) ||
 			     (b = CAST(const char *,
@@ -1585,7 +1573,7 @@ mget(struct magic_set *ms, const unsigned char *s,
 
 	case FILE_INDIRECT:
 		if ((ms->flags & (MAGIC_MIME|MAGIC_APPLE)) == 0 &&
-		    file_printf(ms, m->desc) == -1)
+		    file_printf(ms, "%s", m->desc) == -1)
 			return -1;
 		if (nbytes < offset)
 			return 0;
@@ -1640,8 +1628,9 @@ file_strncmp(const char *s1, const char *s2, size_t len, uint32_t flags)
 			    isspace(*a)) {
 				a++;
 				if (isspace(*b++)) {
-					while (isspace(*b))
-						b++;
+					if (!isspace(*a))
+						while (isspace(*b))
+							b++;
 				}
 				else {
 					v = 1;
@@ -1902,39 +1891,41 @@ magiccheck(struct magic_set *ms, struct magic *m)
 	switch (m->reln) {
 	case 'x':
 		if ((ms->flags & MAGIC_DEBUG) != 0)
-			(void) fprintf(stderr, "%llu == *any* = 1\n",
-			    (unsigned long long)v);
+			(void) fprintf(stderr, "%" INT64_T_FORMAT
+			    "u == *any* = 1\n", (unsigned long long)v);
 		matched = 1;
 		break;
 
 	case '!':
 		matched = v != l;
 		if ((ms->flags & MAGIC_DEBUG) != 0)
-			(void) fprintf(stderr, "%llu != %llu = %d\n",
-			    (unsigned long long)v, (unsigned long long)l,
-			    matched);
+			(void)
fprintf(stderr, "%" INT64_T_FORMAT "u != %"
+			    INT64_T_FORMAT "u = %d\n", (unsigned long long)v,
+			    (unsigned long long)l, matched);
 		break;
 
 	case '=':
 		matched = v == l;
 		if ((ms->flags & MAGIC_DEBUG) != 0)
-			(void) fprintf(stderr, "%llu == %llu = %d\n",
-			    (unsigned long long)v, (unsigned long long)l,
-			    matched);
+			(void) fprintf(stderr, "%" INT64_T_FORMAT "u == %"
+			    INT64_T_FORMAT "u = %d\n", (unsigned long long)v,
+			    (unsigned long long)l, matched);
 		break;
 
 	case '>':
 		if (m->flag & UNSIGNED) {
 			matched = v > l;
 			if ((ms->flags & MAGIC_DEBUG) != 0)
-				(void) fprintf(stderr, "%llu > %llu = %d\n",
+				(void) fprintf(stderr, "%" INT64_T_FORMAT
+				    "u > %" INT64_T_FORMAT "u = %d\n",
 				    (unsigned long long)v,
 				    (unsigned long long)l, matched);
 		}
 		else {
 			matched = (int64_t) v > (int64_t) l;
 			if ((ms->flags & MAGIC_DEBUG) != 0)
-				(void) fprintf(stderr, "%lld > %lld = %d\n",
+				(void) fprintf(stderr, "%" INT64_T_FORMAT
+				    "d > %" INT64_T_FORMAT "d = %d\n",
 				    (long long)v, (long long)l, matched);
 		}
 		break;
@@ -1943,32 +1934,38 @@ magiccheck(struct magic_set *ms, struct magic *m)
 		if (m->flag & UNSIGNED) {
 			matched = v < l;
 			if ((ms->flags & MAGIC_DEBUG) != 0)
-				(void) fprintf(stderr, "%llu < %llu = %d\n",
+				(void) fprintf(stderr, "%" INT64_T_FORMAT
+				    "u < %" INT64_T_FORMAT "u = %d\n",
 				    (unsigned long long)v,
 				    (unsigned long long)l, matched);
 		}
 		else {
 			matched = (int64_t) v < (int64_t) l;
 			if ((ms->flags & MAGIC_DEBUG) != 0)
-				(void) fprintf(stderr, "%lld < %lld = %d\n",
-				    (long long)v, (long long)l, matched);
+				(void) fprintf(stderr, "%" INT64_T_FORMAT
+				    "d < %" INT64_T_FORMAT "d = %d\n",
+				    (long long)v, (long long)l, matched);
 		}
 		break;
 
 	case '&':
 		matched = (v & l) == l;
 		if ((ms->flags & MAGIC_DEBUG) != 0)
-			(void) fprintf(stderr, "((%llx & %llx) == %llx) = %d\n",
-			    (unsigned long long)v, (unsigned long long)l,
-			    (unsigned long long)l, matched);
+			(void) fprintf(stderr, "((%" INT64_T_FORMAT "x & %"
+			    INT64_T_FORMAT "x) == %" INT64_T_FORMAT
+			    "x) = %d\n", (unsigned long long)v,
+			    (unsigned long long)l,
(unsigned long long)l,
+			    matched);
 		break;
 
 	case '^':
 		matched = (v & l) != l;
 		if ((ms->flags & MAGIC_DEBUG) != 0)
-			(void) fprintf(stderr, "((%llx & %llx) != %llx) = %d\n",
-			    (unsigned long long)v, (unsigned long long)l,
-			    (unsigned long long)l, matched);
+			(void) fprintf(stderr, "((%" INT64_T_FORMAT "x & %"
+			    INT64_T_FORMAT "x) != %" INT64_T_FORMAT
+			    "x) = %d\n", (unsigned long long)v,
+			    (unsigned long long)l, (unsigned long long)l,
+			    matched);
 		break;
 
 	default:
diff --git a/contrib/file/src/tar.h b/contrib/file/src/tar.h
index fa2390ab7d..854d4552e6 100644
--- a/contrib/file/src/tar.h
+++ b/contrib/file/src/tar.h
@@ -2,7 +2,7 @@
  * Copyright (c) Ian F. Darwin 1986-1995.
  * Software written by Ian F. Darwin and others;
  * maintained 1995-present by Christos Zoulas and others.
- * 
+ *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions
  * are met:
@@ -12,7 +12,7 @@
  * 2. Redistributions in binary form must reproduce the above copyright
  *    notice, this list of conditions and the following disclaimer in the
  *    documentation and/or other materials provided with the distribution.
- * 
+ *
  * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
@@ -32,7 +32,7 @@
  *
  * Created 25 August 1985 by John Gilmore, ihnp4!hoptoad!gnu.
  *
- * $File: tar.h,v 1.12 2008/02/07 00:58:52 christos Exp $ # checkin only
+ * $File: tar.h,v 1.13 2010/11/30 14:58:53 rrt Exp $ # checkin only
  */
 
 /*
@@ -49,7 +49,7 @@
 #define TGNMLEN 32
 
 union record {
-	char charptr[RECORDSIZE];
+	unsigned char charptr[RECORDSIZE];
 	struct header {
 		char name[NAMSIZ];
 		char mode[8];