UNITED STATES DISTRICT COURT SOUTHERN DISTRICT OF NEW YORK
February 7, 2011
NATIONAL DAY LABORER ORGANIZING NETWORK, ET AL., PLAINTIFFS,
UNITED STATES IMMIGRATION AND CUSTOMS ENFORCEMENT AGENCY, ET AL., DEFENDANTS.
The opinion of the court was delivered by: Shira A. Scheindlin, U.S.D.J.:
OPINION AND ORDER
Plaintiffs brought this action for the purpose of obtaining records, pursuant to the Freedom of Information Act ("FOIA"), from four government agencies (collectively, "Defendants"). Specifically, the requests pertain to Secure Communities, a collaborative program established by the United States Immigration and Customs Enforcement Agency ("ICE") and the Department of Justice ("DOJ") that enlists states and localities in the enforcement of federal immigration law.*fn1 A dispute has now arisen regarding the format in which the Defendants have produced records to Plaintiffs, and will be required to produce records to Plaintiffs in the future. Those records consist of electronic text records, e-mails, spreadsheets, and paper records. To set the stage, I note that generally speaking records can be produced in hard copy, static images (with or without load files) and native file format (with or without load files).
In February 2010, Plaintiffs submitted identical twenty-one page FOIA requests to each of the four defendant agencies.*fn2 Defendants claim that these requests would require production of millions of pages of responsive documents. Because the Plaintiffs received no substantive response to their requests, on April 27, 2010, they brought this suit to compel production of responsive records. After negotiating with the Government, Plaintiffs agreed to create a five-page Rapid Production List ("RPL") identifying specific records that would be sought and hopefully produced on an expedited basis. The Government believes that even responses to the RPL will involve thousands of pages of records.
After further negotiations, the parties reached an agreement on July 7, 2010, regarding production responsive to the RPL. In substance, the Defendants agreed to produce "the bulk of responsive, non-exempt materials by Friday, July 30."*fn3 The agreement also provided that if the Defendants identified responsive, non-exempt materials that could not be produced by that date, they would provide Plaintiffs with a description of such materials by July 26, and would propose an alternative date for their production. Defendants failed to produce any records by the agreed-upon July 30 date, but nearly two thousand pages of records were produced on August 3, August 13, September 8, and October 22, 2010.*fn4 These productions did not satisfy the July 7 agreement.
On October 22, 2010, Plaintiffs moved for a preliminary injunction to compel production for five categories of the RPL documents that had not been produced. Specifically, Plaintiffs asked the Court to order (1) that Opt-Out records -- defined in the RPL as "National policy memoranda, legal memoranda or communications relating to the ability of states or localities to opt-out or limit their participation in [the program]" -- be produced within five days; (2) that Defendants provide a Vaughn index within ten days;*fn5 and (3) that an expedited briefing schedule be set for contested exemptions. The motion was resolved at a conference held on December 9, 2010, with an order requiring Defendants to provide the Opt-Out Records by January 17, 2011.*fn6
On December 22, 2010, Plaintiffs sent the Government a Proposed Protocol Governing the Production of Records ("Proposed Protocol"). This proposal, annexed hereto as Exhibit A, sets forth a requested format for the production of electronic records and a separate requested format for the production of paper records. As Plaintiffs note, the Proposed Protocol is based, in part, on the format demands routinely made by two government entities -- the Securities and Exchange Commission and the Department of Justice Criminal Division.
In advance of a court conference scheduled for January 12, 2011, Defendants produced five PDF files totaling less than three thousand pages. Upon receipt of these files, Plaintiffs again sought assistance from the Court, asserting that the form in which these records were produced was unusable.*fn7 Plaintiffs made three specific complaints: (1) the data was produced in an unsearchable PDF format; (2) electronic records were stripped of all metadata; and (3) paper and electronic records were indiscriminately merged together in one PDF file. Plaintiffs asked the Court to "so order" the Proposed Protocol.*fn8 In response, the Government submitted a letter defending its form of production.*fn9 An oral argument on this issue was held on January 12, 2011.*fn10
Before turning to a discussion of the issues raised by this dispute, it is important to describe what the parties did and did not do in an effort to negotiate an agreed-upon form of production. As far as I can tell from the record submitted by the parties, the equivalent of a Rule 26(f) conference, at which the parties are required to discuss form of production, was not held and no agreement regarding form of production was ever reached. Nor was a dispute regarding form of production brought to the Court for resolution. The Proposed Protocol was first provided to Defendants on December 22, 2010, which was also the first time Plaintiffs made a written demand for load files and metadata fields.*fn11 Prior to December 22, the only written specification of form of production was a July 23 e-mail from Bridget Kessler, Plaintiffs' counsel, to AUSA Connolly, Defendants' counsel. Given its importance and brevity, I quote the full text of this e-mail:
We would appreciate if you could let us know as soon as possible how ICE plans to produce the Rapid Production List to plaintiffs. To facilitate review of the documents between several offices, please (1) produce the responsive records on a CD and, if possible, as an attachment to an email; (2) save each document on the CD as a separate file; (3) provide excel documents in excel file format and not as PDF screen shots; and (4) produce all documents with consecutively numbered bate [sic] stamps. . . . Thank you for your help and if you have any questions or concerns, please feel free to call me.
It is undisputed that Defendants' counsel did not respond to the e-mail by raising any questions or concerns. Defendants do not deny that the records that have been produced, including but not limited to spreadsheets, are in an unsearchable PDF format with no metadata.
III. APPLICABLE LAW
A. FOIA and the Federal Rules of Civil Procedure
FOIA provides that "[i]n making any record available to a person under this paragraph, an agency shall provide the record in any form or format requested by the person if the record is readily reproducible by the agency in that form or format."*fn12
While Congress has recognized the need for "Government agencies [to] use new technology to enhance public access to agency records and information," there is surprisingly little case law defining this standard.*fn13 The leading case, Sample v. Bureau of Prisons, provides the following guidance:
Under any reading of the statute, however, "readily reproducible" simply refers to an agency's technical capability to create the records in a particular format. No case construing the language focuses on the characteristics of the requester. See, e.g., TPS, Inc. v. U.S. Dep't of Defense, 330 F.3d 1191, 1195 (9th Cir. 2003) (interpreting "readily reproducible" as referring to technical capability); see also, e.g., Carlson v. U.S. Postal Serv., 2005 WL 756573, at *7 (N.D. Cal. 2005) (holding that "readily reproducible" in a requested format means "readily accessible" by the agency in that format); Landmark Legal Found. v. EPA, 272 F. Supp. 2d 59, 63 (D.D.C. 2003) (construing "readily reproducible" as the ability to duplicate).*fn14
Rule 34 of the Federal Rules of Civil Procedure also addresses the form of production of records, albeit in the context of discovery. The Rule is divided into a series of steps that are intended to facilitate production in a useful format. First, the requesting party may specify the form of production of electronically stored information ("ESI").*fn15 Second, the responding party may object to the specified form; if it does so, it must state the form that it intends to use.*fn16 If the requesting party disagrees with the counter-proposal, the parties must attempt to resolve the disagreement. If they cannot, the requesting party may make a motion to compel production in the requested form. Third, if the requesting party has not specified a form of production, the responding party must state the form that it intends to use.*fn17 The responding party may select the form in which the material "is ordinarily maintained," or in a "reasonably usable form."*fn18 The Advisory Committee Note to Rule 34 states that the responding party's "option to produce [ESI] in a reasonably usable form does not mean that [it] is free to convert [ESI] from the form in which it is ordinarily maintained to a different form that makes it more difficult or burdensome for the requesting party to use the information efficiently."*fn19 Finally, the Advisory Committee Note also states that if the ESI is kept in an electronically-searchable form, it "should not be produced in a form that removes or significantly degrades this feature."*fn20
B. Case Law
1. Metadata and Load Files
In Aguilar v. Immigration and Customs Enforcement Division of the United States Department of Homeland Security, Magistrate Judge Frank Maas, of this District, provided a guidebook that explained the various types of metadata and the relationship between a record and its metadata.*fn21 In that opinion, Judge Maas noted that in the second edition of the Sedona Principles, the Conference abandoned an earlier presumption against the production of metadata in recognition of "'the need to produce reasonably accessible metadata that will enable the receiving party to have the same ability to access, search, and display the information as the producing party . . . .'"*fn22 By now, it is well accepted, if not indisputable, that metadata is generally considered to be an integral part of an electronic record.*fn23
The Aguilar decision also explained the term load file, quoting from The Sedona Conference Glossary. The 2010 version of the Glossary now defines the term as follows:
A file that relates to a set of scanned images of electronically processed files, and indicates where individual pages or files belong together as documents, to include attachments, and where each document begins and ends. A load file may also contain data relevant to the individual documents, such as selected metadata, coded data, and extracted texts. Load files should be obtained and provided in prearranged or standardized formats to ensure transfer of accurate and usable images and data.*fn24
Once again, it is by now well accepted that when a collection of static images is produced, load files must also be produced in order to make the production searchable and therefore reasonably usable.*fn25
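By way of illustration only, the role a load file plays can be sketched in a few lines of Python. The sketch assumes a hypothetical load file reduced to three pieces of information per document: a first page, a last page, and an optional parent document. The names `LoadEntry`, `document_for_page`, and `attachments_of`, and the sample identifiers, are invented for this example and are not drawn from the Sedona Glossary or from any production in this case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LoadEntry:
    """One row of a hypothetical load file: where a document begins and ends."""
    doc_id: str
    first_page: int
    last_page: int
    parent_id: Optional[str] = None  # set when the document is an attachment


# Illustrative data: nine page images belonging to three documents.
ENTRIES: List[LoadEntry] = [
    LoadEntry("DOC-0001", 1, 3),                        # a three-page e-mail
    LoadEntry("DOC-0002", 4, 4, parent_id="DOC-0001"),  # its one-page attachment
    LoadEntry("DOC-0003", 5, 9),                        # an unrelated memorandum
]


def document_for_page(entries: List[LoadEntry], page: int) -> Optional[LoadEntry]:
    """Return the document that contains the given page image, if any."""
    for entry in entries:
        if entry.first_page <= page <= entry.last_page:
            return entry
    return None


def attachments_of(entries: List[LoadEntry], doc_id: str) -> List[str]:
    """Return the identifiers of documents attached to the given parent."""
    return [e.doc_id for e in entries if e.parent_id == doc_id]
```

Without entries of this kind, the nine images are an undifferentiated stack: nothing indicates that page 4 is an attachment to the e-mail on pages 1 through 3 rather than the start of a new record.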
2. FOIA and Metadata
No federal court has yet recognized that metadata is part of a public record as defined in FOIA. However, this precise issue has been addressed by several state courts, which have uniformly held, in the context of state freedom of information laws, that metadata is indeed a part of public records and must be disclosed pursuant to a request for public records.*fn26 In his January 6, 2011 letter, Plaintiffs' counsel does cite one FOIA case recognizing that production in a medium which detrimentally affects the access to the information sought is inappropriate because it could improperly "reduce the quantum of information made available."*fn27
A. August - October 2010 and January 2011 Productions
The Government defends the format of the productions to date based on its claim that Plaintiffs failed to make a timely request for metadata. Placing heavy reliance on Aguilar, Defendants quote the following language: "[I]f a party wants metadata, it should 'Ask for it. Up front. Otherwise, if [the party] ask[s] too late or ha[s] already received the document in another form, [it] may be out of luck.'"*fn28 Given Plaintiffs' July 23 e-mail and Defendants' tardy productions, I cannot accept this lame excuse for failing to produce the records in a usable format.
First, the language of Plaintiffs' July 23 e-mail, while less than crystal clear, was sufficient to put Defendants on notice of certain requests regarding form of production. Defendants were asked to "(2) save each document on the CD as a separate file; (3) provide excel documents in excel file format and not as PDF screen shots; and (4) produce all documents with consecutively numbered bate [sic] stamps." These requests explicitly placed Defendants on notice that spreadsheets were sought in native format - not as a PDF screen shot - and that each text record should be produced as a separate file (i.e., in single file format). In addition, the request for consecutively numbered Bates stamping also put Defendants on notice of the need for single file format. Second, and of equal if not greater importance, Plaintiffs asked the Government to "let us know as soon as possible how ICE plans to produce the Rapid Production List." Had the Government done as it was asked, any ambiguity as to the nature of the requested format would have been resolved. Finally, Plaintiffs wrote "if you have any questions or concerns, please feel free to call me." This invitation was ignored.
Defendants violated the explicit requests of the July 23 e-mail by producing all of the records in non-searchable PDF format, merging all records without indicating any separate files, merging paper with electronic records, and failing to produce e-mails with attachments. They also violated the Federal Rules of Civil Procedure (the "Rules") by failing to produce the records in a reasonably usable form, and by producing the records in a form that makes it difficult or burdensome for the requesting party to use the information efficiently.
The Government argues that metadata is substantive information that must be explicitly requested and then reviewed by an agency for possible exemptions.*fn29 Because there is no controlling FOIA precedent recognizing that metadata is an integral part of the electronic record that must be produced when an electronic record is requested, the Government asserts that it complied with its FOIA obligations, even if it did not comply with the Rules.*fn30 To that end, the Government argues that if the requirements of FOIA and the requirements of the Rules conflict, FOIA must trump the Rules.*fn31
However, there is no need to decide this question because FOIA does not conflict with the Rules. FOIA is silent with respect to form of production, requiring only that the record be provided in "any form or format requested by the person if the record is readily reproducible by the agency in that form or format." There is no doubt in my mind that this language refers only to technical ability or, at most, reasonable accessibility. Defendants do not argue that they are unable to produce the records in the requested form - namely native format for spreadsheets and single file format for text records - but that reviewing all of the metadata would greatly increase the burden of search and production. To that extent, they have unwittingly argued that a request to produce all metadata would push the request into the second tier of Rule 26(b)(2)(B) because such records are not reasonably accessible based on undue burden and cost.*fn32
Nonetheless, the Government argues that FOIA is not synonymous with discovery in a civil litigation. It is a statute requiring the production of records to the public, upon request, subject to certain exemptions. While rhetorically nuanced, this argument is unavailing. Regardless of whether FOIA requests are subject to the same rules governing discovery requests, Rule 34 surely should inform highly experienced litigators as to what is expected of them when making a document production in the twenty-first century.*fn33 As noted earlier, Defendants' productions to date have failed to comply with Rule 34 or with FOIA.
The next issue to address is the appropriate remedy. Because no metadata was specifically requested in Plaintiffs' July 23 e-mail, and because this is an issue of first impression, I will not require Defendants to re-produce all of the records with metadata. Moreover, while native format is often the best form of production, it is easy to see why it is not feasible where a significant amount of information must be redacted.*fn34 Therefore, Defendants are ordered to re-produce all text records in static image single file format together with their attachments. However, they must re-produce all spreadsheets in native format as requested by Plaintiffs' July 23 e-mail.*fn35 All records must be Bates stamped, which should also assist in the production of single file format.*fn36
That said, I now hold, consistent with the state court decisions cited earlier, that certain metadata is an integral or intrinsic part of an electronic record.*fn37
As a result, such metadata is "readily reproducible" in the FOIA context. The only remaining issue is which of the many types of metadata are an intrinsic part of an electronic record. Unfortunately, there is no ready answer to this question. The answer depends, in part, on the type of electronic record at issue (i.e., text record, e-mail, or spreadsheet) and on how the agency maintains its records. Some agencies may maintain only a printed or imaged document as the final or official version of a record. Others retain all records in native format, which preserves much of the metadata. Electronic records may have migrated from one system to another, maintaining some metadata but not all. The best way I can answer the question is that metadata maintained by the agency as a part of an electronic record is presumptively producible under FOIA, unless the agency demonstrates that such metadata is not "readily reproducible."
B. Future Productions
The Government argues that the Proposed Protocol should not be required for the January 17 production of the Opt-Out Records because it was received after the Government had completed the great bulk of its search.*fn38 While it surely had not reviewed all of the documents as of December 22, there is no reason to question the Government's claim that it began to search in earnest on December 10, following a court conference on December 9 which set a production deadline of January 17, and that it had completed the bulk of its collection efforts by December 22.*fn39 The Government notes that the form of production issue was not raised at the December 9 conference or at any time prior to December 22. Given this time line, the January 17 production shall be made (or re-made if already completed) in the same format that I have now required for the earlier productions.
I turn now to all future productions. Here, Plaintiffs ask that the bulk of the ESI be produced in TIFF image format but with corresponding load files, Bates stamping, and the preservation of "parent-child" relationships (i.e., the association between an attachment and its parent record). Plaintiffs also request twenty-four specific fields of metadata, which presumably will be the content of the load files. Finally, Plaintiffs request that spreadsheets be produced in both native format and TIFF format. Hard copy records are requested in single page TIFF image format with corresponding load files to provide ease of review.
The Government has not made a counterproposal in response to Plaintiffs' Proposed Protocol. Nonetheless, the Court will not impose any greater burden on the Defendants than is absolutely necessary to conduct an efficient review. After reviewing the Proposed Protocol as well as various sources discussing essential metadata fields,*fn40 I conclude that all future productions by Defendants must include load files that contain the following fields, which apply to all forms of ESI.*fn41
1. Identifier: A unique production identifier ("UPI") of the item.*fn42
2. File Name: The original name of the item or file when collected from the source custodian or system.
3. Custodian: The name of the custodian or source system from which the item was collected.
4. Source Device: The device from which the item was collected.
5. Source Path: The file path from the location from which the item was collected.
6. Production Path: The file path to the item produced from the production media.
7. Modified Date: The last modified date of the item when collected from the source custodian or system.
8. Modified Time: The last modified time of the item when collected from the source custodian or system.
9. Time Offset Value: The universal time*fn43 offset of the item's modified date and time based on the source system's time zone and daylight savings time settings.
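For illustration, the last three fields in the foregoing list can be derived from a single time-zone-aware timestamp. The Python sketch below rests on assumptions of its own: the function names `format_offset` and `modified_fields`, the `YYYY-MM-DD` and `HH:MM:SS` layouts, and the `+HH:MM` offset notation are hypothetical choices made for this example, since this Opinion does not prescribe a notation.

```python
from datetime import datetime, timedelta, timezone


def format_offset(moment: datetime) -> str:
    """Render a timestamp's offset from universal time as +HH:MM or -HH:MM."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError("timestamp carries no time zone information")
    total_minutes = int(offset.total_seconds() // 60)
    sign = "-" if total_minutes < 0 else "+"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{sign}{hours:02d}:{minutes:02d}"


def modified_fields(moment: datetime) -> dict:
    """Split a last-modified timestamp into the three load file fields."""
    return {
        "Modified Date": moment.strftime("%Y-%m-%d"),
        "Modified Time": moment.strftime("%H:%M:%S"),
        "Time Offset Value": format_offset(moment),
    }


# Example: a file last modified at 9:30 a.m. on a source system set to
# Eastern Standard Time, which is five hours behind universal time.
EASTERN_STANDARD = timezone(timedelta(hours=-5))
SAMPLE = datetime(2011, 1, 12, 9, 30, 0, tzinfo=EASTERN_STANDARD)
SAMPLE_FIELDS = modified_fields(SAMPLE)
```

Recording the offset alongside the local date and time allows a reviewer to place items collected from systems in different time zones, or on different sides of a daylight savings change, on one consistent timeline.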
The following additional fields shall accompany production of all e-mail messages:
1. To: Addressee(s) of the message.
2. From: The e-mail address of the person sending the message.
3. CC: Person(s) copied on the message.
4. BCC: Person(s) blind copied on the message.
5. Date Sent: Date the message was sent.
6. Time Sent: Time the message was sent.
7. Subject: Subject line of the message.
8. Date Received: Date the message was received.
9. Time Received: Time the message was received.
10. Attachments: The Bates number ranges of e-mail attachments. The parties may alternatively choose to use: Bates_Begin, Bates_End, Attach_Begin and Attach_End.
The following additional fields shall accompany images of paper records:
1. Bates_Begin: The beginning Bates number or UPI for the first page of the document.
2. Bates_End: The ending Bates number or UPI for the last page of the document.
3. Attach_Begin: The Bates number or UPI of the first page of the first attachment to the parent document; and
4. Attach_End: The Bates number or UPI of the last page of the last attachment to the parent document.
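The three field lists above can be pictured as the column headings of a delimited load file. The following Python sketch is offered only as an illustration under stated assumptions: the field names are taken from this Opinion, but the comma-separated layout, the record-type labels (`esi`, `email`, `paper`), the function names, and the treatment of the nine general fields as the base for every record type are hypothetical choices made for this example. The sketch checks only that each field is present; a field may be present and empty.

```python
import csv
import io
from typing import Dict, List

COMMON_FIELDS = [
    "Identifier", "File Name", "Custodian", "Source Device", "Source Path",
    "Production Path", "Modified Date", "Modified Time", "Time Offset Value",
]
EMAIL_FIELDS = [
    "To", "From", "CC", "BCC", "Date Sent", "Time Sent", "Subject",
    "Date Received", "Time Received", "Attachments",
]
PAPER_FIELDS = ["Bates_Begin", "Bates_End", "Attach_Begin", "Attach_End"]

# Hypothetical record-type labels mapped to their additional fields.
EXTRA_FIELDS = {"esi": [], "email": EMAIL_FIELDS, "paper": PAPER_FIELDS}


def required_fields(record_type: str) -> List[str]:
    """Return the load file columns expected for one type of record."""
    if record_type not in EXTRA_FIELDS:
        raise ValueError(f"unknown record type: {record_type}")
    return COMMON_FIELDS + EXTRA_FIELDS[record_type]


def missing_fields(record_type: str, record: Dict[str, str]) -> List[str]:
    """List the expected columns that a record does not supply at all."""
    return [name for name in required_fields(record_type) if name not in record]


def write_load_file(record_type: str, records: List[Dict[str, str]]) -> str:
    """Serialize records of one type as comma-separated text with a header row."""
    columns = required_fields(record_type)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        gaps = missing_fields(record_type, record)
        if gaps:
            raise ValueError(f"record lacks fields: {', '.join(gaps)}")
        writer.writerow(record)
    return buffer.getvalue()
```

A producing party could run a check of this kind before delivery, so that an omitted field is caught at the source rather than discovered by the receiving party during review.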
In addition, Defendants must produce spreadsheets in native format, with accompanying load files if the required metadata is not preserved in the native file. However, unless Plaintiffs can demonstrate why it is also necessary to produce the spreadsheets in TIFF format, Defendants need not make such a production. Conversely, the Government may produce the spreadsheets in TIFF format with load files containing the applicable metadata fields, if it can demonstrate why native production of spreadsheets would inevitably reveal exempt information.
Although Plaintiffs requested certain additional fields -- namely Parent Folder; File Size; File Extension; Record Type; Master_Date; and Author -- I conclude that these fields need not be produced in this case. Except as noted in this Opinion, all of the other format requirements in the Proposed Protocol with respect to both ESI and hard copy are hereby ordered.*fn44
One final note. Whether or not metadata has been specifically requested - which it should be - production of a collection of static images without any means of permitting the use of electronic search tools is an inappropriate downgrading of the ESI. That is why the Government's previous production - namely, static images stripped of all metadata and lumped together without any indication of where a record begins and ends - was not an acceptable form of production. The Government would not tolerate such a production when it is a receiving party, and it should not be permitted to make such a production when it is a producing party. Thus, it is no longer acceptable for any party, including the Government, to produce a significant collection of static images of ESI without accompanying load files.*fn45
Once again, this Court is required to rule on an e-discovery issue that could have been avoided had the parties had the good sense to "meet and confer," "cooperate" and generally make every effort to "communicate" as to the form in which ESI would be produced. The quoted words are found in opinion after opinion and yet lawyers fail to take the necessary steps to fulfill their obligations to each other and to the court. While certainly not rising to the level of a breach of an ethical obligation, such conduct certainly shows that all lawyers - even highly respected private lawyers, Government lawyers, and professors of law - need to make greater efforts to comply with the expectations that courts now demand of counsel with respect to expensive and time-consuming document production. Lawyers are all too ready to point the finger at the courts and the Rules for increasing the expense of litigation, but that expense could be greatly diminished if lawyers met their own obligations to ensure that document production is handled as expeditiously and inexpensively as possible. This can only be achieved through cooperation and communication.