« Back to home

Extracting SQL Server Hashes From master.mdf

Posted on

During a number of engagements, I have found myself in a position in which I have held administrative access to a server running a local instance of Microsoft SQL Server, but had not held credentials to access the service.

For seasoned penetration testers out there, you will know that there are multiple ways to gain access to a SQL Server once you have access to the local server, for example running SQLQuery as the SYSTEM user which often runs under the SysAdmin permission, or injecting into the service process and spawning SSMS.

It always worries me when injecting into a SQL Server process (or any critical process for that matter), especially during a live engagement in which downtime for the service can have real consequences. Up until now I have never had an option to complete a "safe" review of the credentials stored on such a database server.

With this is in mind, I started to research the process of safely recovering password hashes from a SQL Server instance when deployed in "Mixed Authentication Mode", which provides value during an assessment and helps you to potentially access the SQL server via the front door with weak passwords. The obvious target to do this was by reviewing the configuration and metadata that Microsoft SQL Server stores in its master.mdf file. There were a number of challenges in getting this to work, which I have documented below for anyone else interested in implementing or improving on this method.


To begin, we need a basic understanding of the structure of the on-disk database files which are our likely target for password hashes. When listing the DATA directory of a SQL Server install, commonly found in C:\Program Files\Microsoft SQL Server\SQL-VERSION\MSSQL\DATA, it is common to find 2 file types:

  • MDF
  • LDF

For the purposes of this post, we will be exploring the MDF file structure which contains the data we are interested in. For reference, LDF files contain the transaction log associated with a database and will likely be reviewed seperately in a future post.

The MDF file structure is organised into a number of “pages”, which in the case of SQL Server are 8KB in size and consist of a number of elements. As shown within Microsoft’s documentation, a page has the following structure:

Whilst this is a pretty basic overview of a page contents (excluding fun things like NULL bitmaps and index metadata), it gives an idea of the important concepts we will need to understand before being able to grab data from a MDF file.

The page header contains the metadata associated with the body of a page. The work of reversing this header has already been done by Mark Rasmussen in his series of blog posts here, and is worth a read if you are interested in SQL Server forensics, or are just looking to understand its inner workings.

Taken from Mark's blog post, we can see that the header contains a number of fields useful to our cause:

  Bytes    Content
  -----    -------
  00    HeaderVersion (tinyint)
  01    Type (tinyint)
  02    TypeFlagBits (tinyint)
  03    Level (tinyint)
  04-05    FlagBits (smallint)
  06-07    IndexID (smallint)
  08-11    PreviousPageID (int)
  12-13    PreviousFileID (smallint)
  14-15    Pminlen (smallint)
  16-19    NextPageID (int)
  20-21    NextPageFileID (smallint)
  22-23    SlotCnt (smallint)
  24-27    ObjectID (int)
  28-29    FreeCnt (smallint)
  30-31    FreeData (smallint)
  32-35    PageID (int)
  36-37    FileID (smallint)
  38-39    ReservedCnt (smallint)
  40-43    Lsn1 (int)
  44-47    Lsn2 (int)
  48-49    Lsn3 (smallint)
  50-51    XactReserved (smallint)
  52-55    XdesIDPart2 (int)
  56-57    XdesIDPart1 (smallint)
  58-59    GhostRecCnt (smallint)
  60-95    ?

When scanning through pages in the MDF file, we will be looking for the "ObjectID" of our target tables (explored a bit further on), as well as the "Type" flag which will show us the type of data stored within the page.

Data Row

After the page header, we find the bulk of our stored data contained within data rows. There are a number of bytes that we need to understand to allow extraction of data from a table.

The first 2 bytes of a data row contain status bits, which are essentially flags to indicate if variable length fields such as varchar are present within the row.

Next up is the length of fixed sized fields in the row, so these are your char's, int etc.

Then we have the fixed data, so if your row contains two "char(4)" columns, this will contain 8 bytes of data.

Following this we have the "NULL bitmap", which indicates if the column contains a NULL value for this row. So for example if we have 8 columns and they all contain NULL values, this value will be set to 0xFF.

Next we have 2 bytes which contain the number of variable length columns stored in the row.

Then we have the variable column offset array, which is an array of 2 byte entries containing the offset within the row of where each variable length column ends. This allows you to quickly seek to the column that you are looking for in the row by looking up its offset within the array.

Finally we have the variable length-data, which is the actual data contained within those varchar/nvarchar fields.

If you would like to understand the inner workings of these structures, I highly recommend the book "Microsoft SQL Server 2008 Internals", which contains a lot of information on status bits and values contained wtihin the data row structure.

For our purposes, we are going to let a library take care of the parsing and navigation of the data row structure which I will show below.

User/Hash Storage in SQL Server

Now for the interesting stuff - how and where are password hashes stored in SQL Server. As any security minded SQL Server user will know, password hashes can be retrieved with the following query:

SELECT name, password_hash FROM sys.sql_logins

This query returns both the username and password hash of a SQL Server login and it is this data that we need to find within the MDF file. Let’s take a look at what kind of an object "sys.sql_logins" is:

SELECT * FROM sys.sysobjects WHERE id = OBJECT_ID(‘sys.sql_logins’)

It seems that this is a view, which means that before we can look to extract the underlying data from the file, we need to find out which table holds the actual username and hash. We can do this by using “sp_helptext” to find the source of the view data:

exec sp_helptext [sys.sql_logins]

Which gives us the following:

CREATE VIEW sys.sql_logins AS
  SELECT p.name,
    p.id AS principal_id,
    p.sid, p.type,
    n.name AS type_desc,
    sysconv(bit, p.status & 0x80) AS is_disabled,
    p.crdate AS create_date,
    p.modate AS modify_date,
    p.dbname AS default_database_name,
    p.lang AS default_language_name,
    r.indepid AS credential_id,
    sysconv(bit, p.status & 0x10000) AS is_policy_checked,
    sysconv(bit, p.status & 0x20000) AS is_expiration_checked,
    convert(varbinary(256), LoginProperty(p.name, 'PasswordHash')) AS password_hash
    FROM master.sys.sysxlgns p
    LEFT JOIN sys.syssingleobjrefs r ON r.depid = p.id AND r.class = 63 AND r.depsubid = 0    -- SRC_LOGIN_CREDENTIAL
    LEFT JOIN sys.syspalnames n ON n.class = 'LGTY' AND n.value = p.type
    WHERE type = 'S'
    AND has_access('LG', id) = 1

This shows that our data is held within a system table of sys.sqlxlgns. To verify, lets try and select the data:

SELECT * FROM master.sys.sysxlgns

So apparently SQL Server is unable to retrieve this table when logged in as a sysadmin user, which is strange. After a bit of digging it turns out that this table can only be queried using a dedicated admin connection (DAC). To access SQL Server via the DAC, we can use SQLQuery.exe with the ‘-A’ option or SSMS with the ADMIN prefix before the server name. A good walkthrough of the DAC can be found here.

So now we are connected using a DAC, lets try again:

SELECT * FROM sys.sysxlgns

And there we go, our list of users and password hashes. Next up, we need to find this table in the master.mdf file.

Locating objects via sys.sysschobjs

When parsing the raw data of a MDF file, as discussed above, a page contains the rows of data associated with a table. Within the header of a page, there is an ObjectID, but unfortunately for us there doesn’t seem to be a reference to the table name which means that we need a way to translate the ObjectID to a table name without the aid of SQL Server.

To do this, we can use the sys.sysschobjs table, which is an internal system table (https://msdn.microsoft.com/en-us/library/ms179503.aspx).

On SQL Server 2012, the sys.sysschobjs table has the following schema:

id      - int
name    - sysname
nsid    - int
nsclass - tinyint
status  - int
type    - char
pid     - int
pclass  - tinyint
intprop - int
created - datetime
modified - datetime
status2 - int

Of these fields, we are interested in the ‘id’ and ‘name’ fields for locating the sysxlgns table ObjectID. This table is another example of an internal system table which only seems accessible via a DAC, so lets query it to ensure that our sys.sysxlgns table is present and accessible:

SELECT * FROM sys.sysschobjs WHERE id = OBJECT_ID(‘sys.sysxlgns’)

Great, so we have an ObjectID of “42” for the sysxlgns table.

Now all we need to do is find the sys.sysschobjs page in the MDF file and we're all set. Lucky for us, it seems that the sys.sysschobjs ObjectID is set to "34" in versions of SQL Server 2008 onwards. This may change in future versions (or service packs), but for each version I've seen, and supporting online documentation, this ObjectID seems to be fairly consistent.

Reading master.mdf

OK, so we've explored the MDF file and have a good idea on how we can extract data from its contents, all that is left is to actually retrieve the file.

Attempts to copy the master.mdf file from a running database server will fail as SQL Server locks the file during operation. As I wanted a safe way to dump hashes without interupting the operation of the server, I needed to find a way to access master.mdf with SQL Server operating.

There are a couple of ways to do this, but I settled on the awesome PowerSploit tool Invoke-NinjaCopy. This tool works by parsing the NTFS file system and extracting the raw data from the disk, meaning DACLS and locks held on files are bypassed. Of course you need to be an Administrator to do this.

With that in mind, we can now see our process of extracting hashes becomes:

  • Locate the master.mdf file
  • Retrieve a copy of the file via Invoke-NinjaCopy
  • Scan through the pages of the MDF file, searching for sys.sysschobjs with an ObjectID of 34
  • Parse the sys.sysschobjs table to find the sys.sysxlgns objectID
  • Locate the sys.sysxlgns pages
  • Extract the data from the contained rows


At first, I was looking to extract data by reading in the MDF file and parsing the headers / pages manually until I accessed the data that I needed.

At this point I was told about OrcaMDF, a .NET library which allows offline parsing of MDF files. Specifically, we are interested in the “RawDatabase” functionality which gives us low-level access to pages and data rows without having to create the disk IO methods.

Using this library, I have created Get-MDFHashes, a very rough POC Powershell script which takes an MDF file as input (likely after after being NinjaCopy'd from the server), and outputs the usernames and password hashes. For example:

Once you have retrieved the hashes, you can dump the hashes into JTR and cross your fingers ;)