1) Verifying files and file types is hard! There is a project that tries to keep track of the magic numbers for each file type. A quick scroll through it and you will notice that in some cases multiple extensions use the same magic numbers. In other cases, different versions of an application use different file types and magic numbers. It is a mess!
2) The verification that can be done in the application often isn't enough. We have to remember that we cannot rely on the MVC application to solve all problems. Even the server-side checks that can be implemented are not foolproof. Furthermore, they only verify that a file is of the type it claims to be (or the type you are looking for). You would still have the problem of ensuring the file is actually "safe" for consumption by the target application. At the outset, this seems more of a problem for IDS/IPS/virus scanning than for the MVC application.
3) This type of functionality should have gone through some sort of security review well before it reaches the development team. The risk of uploading files should be pointed out, along with all the compensating controls that are in place to help mitigate that risk. For example, is this page authenticated? Is there sufficient logging? Is there an IDS/IPS in the network? Do the files get virus scanned before being consumed? Can the files be "detonated" on a segregated machine before being given to end users?
As usual, OWASP has a great write-up on this. They cover a lot of the key points you would want to look at when building your application.
First things first: you have to build your form/app to receive files. This is easily done. MVC4 has the concept of HttpPostedFileBase, which is the type the file shows up as. A couple of notes: first, you must make sure that the name of your input field matches the name of the parameter on your controller method. Second, you must set the enctype of the form to multipart/form-data. See this post for more info.
One last thing: the checks below make a lot of assumptions about the types of files you want to accept. You will most probably need to tailor them to your specific case. In my case, I am focusing on docx and xlsx files.
Check 1: Is a file actually there?
Basically this is the equivalent of the ModelState.IsValid check. You want to make sure you are actually processing a file.
  // Treat a missing file and an empty file the same way.
  if (file == null || file.ContentLength == 0)
  {
    return View("Error");
  }
Check 2: File Size
One of the first checks you will want to do is on file size. You can use the request limits feature of request filtering to cap the maximum size you are willing to handle. Two notes: this setting normally applies to the whole site rather than to a single page (IIS does let you scope it to a path with a location element in web.config), and you cannot set a "minimum" size. You could also check the file size in code (file.ContentLength, or Length on the input stream), and that might be a valid check in your case.
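As a minimal sketch of the in-code size check, the helper below tests a length against an inclusive range. The class name, method name, and both limits are hypothetical and chosen only for illustration.

```csharp
using System;

public static class UploadSizeCheck
{
    // Hypothetical limits for illustration; tune them for your application.
    public const long MinBytes = 1;
    public const long MaxBytes = 10L * 1024 * 1024; // 10 MB

    // Returns true when the length falls inside the inclusive range.
    public static bool IsAcceptableSize(long lengthInBytes)
    {
        return lengthInBytes >= MinBytes && lengthInBytes <= MaxBytes;
    }
}
```

In the controller this would be called with something like UploadSizeCheck.IsAcceptableSize(file.InputStream.Length), returning the error view when it comes back false.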
Check 3: File Name
We have to remember here that the file name and content type are passed directly from the HTTP request. This means that they are untrusted client-side input and must be validated. You may want to run a series of checks, but the basics would be the length of the file name and a regular expression match. Here you will want to make some assumptions about the types of files/file names you will be getting.
  if (file.FileName.Length > 255)
  {
    return View("Error");
  }
  // Anchor both ends so that names like "report.docx.exe" are rejected.
  var regex = new Regex(@"^[\w -]+\.\w{4}$");
  if (!regex.IsMatch(file.FileName))
  {
    return View("Error");
  }
  var acceptedExtensions = new List<string> { ".docx", ".xlsx" };
  var fileExtension = Path.GetExtension(file.FileName);
  // Compare case-insensitively so ".DOCX" is treated like ".docx".
  if (!acceptedExtensions.Contains(fileExtension, StringComparer.OrdinalIgnoreCase))
  {
    return View("Error");
  }
 
In my case, I am looking specifically for docx and xlsx files, so I can build those assumptions into the regular expression and the extension list.
Check 4: Content Type
Generally, the content type is generated from the extension present on the file name. This behavior varies between browsers and potentially across OS platforms. Nevertheless, it is information we can look at as a first check. In the code below, I just compare it to a whitelist. You could take this further by keeping a mapping of extension to accepted content type.
  var acceptedContentTypes = new List<string>
  {
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
  };
  if (!acceptedContentTypes.Contains(file.ContentType))
  {
    return View("Error");
  }
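The extension-to-content-type mapping suggested above could look roughly like this. It is a sketch only: the class and method names are hypothetical, and the table holds just the two types this post cares about.

```csharp
using System;
using System.Collections.Generic;

public static class ContentTypeCheck
{
    // Each accepted extension maps to the one content type we expect for it.
    private static readonly Dictionary<string, string> ExpectedContentTypes =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { ".docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document" },
            { ".xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" }
        };

    // True only when the extension is known and the content type matches it.
    public static bool Matches(string extension, string contentType)
    {
        string expected;
        return extension != null
            && ExpectedContentTypes.TryGetValue(extension, out expected)
            && string.Equals(expected, contentType, StringComparison.OrdinalIgnoreCase);
    }
}
```

This catches a mismatch such as a .docx file name sent with the spreadsheet content type, which a flat whitelist would let through. Both values still come from the client, so it remains a first check only.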
Check 5: FindMimeFromData Call
Windows has a built-in call, FindMimeFromData, that does a static check to determine 26 different MIME types. The drawback is that you have to use interop. This is the same check that Internet Explorer performs, but the benefit is that on the server side you control the parameters it is called with.
There are a few steps here to using this. First you have to add the call definition to your code.
        // The string parameters are wide (LPCWSTR) and ppwzMimeOut is a pointer,
        // so it must be an IntPtr to work in 64-bit processes.
        [DllImport("urlmon.dll", CharSet = CharSet.Unicode, ExactSpelling = true)]
        private static extern int FindMimeFromData(
            IntPtr pBC,
            [MarshalAs(UnmanagedType.LPWStr)] string pwzUrl,
            [MarshalAs(UnmanagedType.LPArray, ArraySubType = UnmanagedType.I1, SizeParamIndex = 3)] byte[] pBuffer,
            int cbSize,
            [MarshalAs(UnmanagedType.LPWStr)] string pwzMimeProposed,
            int dwMimeFlags,
            out IntPtr ppwzMimeOut,
            int dwReserved
        );
Next you have to get the header and call the function. This is a rough idea of what you would do.
            const int headerSize = 256;
            var buf = new byte[headerSize];
            file.InputStream.Seek(0, SeekOrigin.Begin);
            // Read returns how many bytes were actually read (fewer for small files).
            var bytesRead = file.InputStream.Read(buf, 0, headerSize);
            // Rewind so that later checks see the whole file again.
            file.InputStream.Seek(0, SeekOrigin.Begin);
            IntPtr mimeTypePointer;
            var result = FindMimeFromData(IntPtr.Zero, null, buf, bytesRead, null, 0, out mimeTypePointer, 0);
            if (result != 0 || mimeTypePointer == IntPtr.Zero)
            {
                return View("Error");
            }
            var mimeType = Marshal.PtrToStringUni(mimeTypePointer);
            // The returned string is allocated by the callee and must be freed.
            Marshal.FreeCoTaskMem(mimeTypePointer);
            if (mimeType == null || !mimeType.Equals("application/x-zip-compressed"))
            {
                return View("Error");
            }
In the case of the Open XML formats, the file that actually gets sent is a zip file, so that is the type you look for here. There are some alternatives to calling out to interop: you could figure out the header bytes for a zip file and code the check manually. See mimedetector or filetypedetective.
So at this point, let us recap. We have done some basic file name checks. We have also done some basic content type checks, but this data is built from the file name, so it can't be trusted. We have run a server-side check and, from the header, we can determine that it is in fact a zip file. Pretty weak!
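As a sketch of the manual alternative to interop mentioned above, the helper below compares the first four bytes of a seekable stream with the zip local file header signature (the bytes 50 4B 03 04, "PK" followed by 0x03 0x04). The class and method names are hypothetical. An empty zip archive starts with a different signature and is rejected here, which seems acceptable since an Open XML document always contains files.

```csharp
using System;
using System.IO;

public static class ZipHeaderCheck
{
    // Local file header signature that starts a non-empty zip archive.
    private static readonly byte[] ZipSignature = { 0x50, 0x4B, 0x03, 0x04 };

    // Assumes a seekable stream; rewinds it before returning.
    public static bool LooksLikeZip(Stream stream)
    {
        var header = new byte[ZipSignature.Length];
        stream.Seek(0, SeekOrigin.Begin);

        // Read may return fewer bytes than requested, so loop until done.
        var read = 0;
        while (read < header.Length)
        {
            var n = stream.Read(header, read, header.Length - read);
            if (n == 0) break;
            read += n;
        }
        stream.Seek(0, SeekOrigin.Begin);

        if (read < header.Length) return false;
        for (var i = 0; i < header.Length; i++)
        {
            if (header[i] != ZipSignature[i]) return false;
        }
        return true;
    }
}
```

This proves no more than the interop call does: the upload starts like a zip file, nothing else.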
At this point, we could now rip open the zip file and start to have a look at what is inside.
Check 6: Zip Verification
The code would look something like this.
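            var isValid = false;
            try
            {
                // leaveOpen: true keeps file.InputStream usable after the archive is disposed.
                using (var archive = new ZipArchive(file.InputStream, ZipArchiveMode.Read, true))
                {
                    // Look the content types part up by its full name at the package root.
                    var entry = archive.GetEntry("[Content_Types].xml");
                    if (entry != null)
                    {
                        using (var contentTypeFile = entry.Open())
                        {
                            var xElement = XElement.Load(contentTypeFile);
                            foreach (var element in xElement.Elements())
                            {
                                // Read PartName explicitly instead of relying on attribute order.
                                var partName = (string)element.Attribute("PartName");
                                if (partName != null &&
                                    (partName.StartsWith("/word/") || partName.StartsWith("/xl/")))
                                {
                                    isValid = true;
                                    break;
                                }
                            }
                        }
                    }
                }
            }
            catch (InvalidDataException)
            {
                // The stream could not be read as a zip archive; isValid stays false.
            }
            if (!isValid)
            {
                return View("Error");
            }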
In .NET 4.5, the ZipFile/ZipArchive functionality was finally introduced. You can use it to open up a stream and check what is inside. All I am doing in the code above is looking for a specific file and checking that file for an attribute that I know needs to be there (as per the standard). If you think about it, all I am doing is verifying that I have received a zip file (that I can open) and that it has a file with some key words in it. Once again, pretty weak stuff!
Conclusion
Where do you stop? It really depends on the needs of your application. There is no good way to handle this. We can start to decrease the risk by adding all of these extra checks, but we can't mitigate it completely. The entire solution needs to take file upload into consideration.
 
Secure Storage
File Storage Location: Store uploaded files in a secure location outside of the web root directory to prevent direct access via URLs.
Access Controls: Implement strict access controls and permissions on uploaded files. Ensure only authorized users or