Friday, February 4, 2011

Storing Images in PostgreSQL

Alright, so I'm working on an application which will use a Linux back-end running PostgreSQL to serve up images to a Windows box with the front end written in C#.NET, though the front-end should hardly matter. My question is:

  • What is the best way to deal with storing images in Postgres?

The images are around 4-6 megapixels each, and we're storing upwards of 3000. It might also be good to note: this is not a web application, there will at most be about two front-ends accessing the database at once.

  • In the database, there are two options:

    • bytea. Stores the data in a column, exported as part of a backup. Uses standard database functions to save and retrieve. Recommended for your needs.
    • blobs. Stores the data externally, not normally exported as part of a backup. Requires special database functions to save and retrieve.

    I've used bytea columns with great success in the past storing 10+gb of images with thousands of rows. PG's TOAST functionality pretty much negates any advantage that blobs have. You'll need to include metadata columns in either case for filename, content-type, dimensions, etc.

    From jcoby
  • Try this. I've use the Large Object Binary (LOB) format for storing generated PDF documents, some of which were 10+ MB in size, in a database and it worked wonderfully.

  • Don't store in images in the database unless you absolutely have to. I understand that this is not a web application, but if there isn't a shared file location that you can point to save the location of the file in the database.

    //linuxserver/images/imagexxx.jpg
    

    then perhaps you can quickly set up a webserver and store the web urls in the database (as well as the local path). While databases can handle LOB's and 3000 images (4-6 Megapixels, assuming 500K an image) 1.5 Gigs isn't a lot of space file systems are much better designed for storing large files than a database is.

  • I agree with what Kris mentioned in his post.

    You can see a similar discussion here: Storing images in db

    From jwolly2
  • Re jcoby's answer:

    bytea being a "normal" column also means the value being read completely into memory when you fetch it. Blobs, in contrast, you can stream into stdout. That helps in reducing the server memory footprint. Especially, when you store 4-6 MPix images.

    No problem with backing up blobs. pg_dump provides "-b" option to include the large objects into the backup.

    So, I prefer using pg_lo_*, you may guess.

    Re Kris Erickson's answer:

    I'd say the opposite :). When images are not the only data you store, don't store them on the file system unless you absolutely have to. It's such a benefit to be always sure about your data consistency, and to have the data "in one piece" (the DB). BTW, PostgreSQL is great in preserving consistency.

    However, true, reality is often too performance-demanding ;-), and it pushes you to serve the binary files from the file system. But even then I tend to use the DB as the "master" storage for binaries, with all the other relations consistently linked, while providing some file system-based caching mechanism for performance optimization.

0 comments:

Post a Comment