Wednesday, February 9, 2011

One to One database relation?

In my free time I started writing a small multiplayer game with a database backend. I was looking to separate player login information from other in game information (inventory, stats, and status) and a friend brought up this might not be the best idea.

Would it be better to lump everything together in one table?

  • Probably as well to put it in one table in this situation as your query performance will be better.

    You really only want to use the next table when there will be an element of duplicate data by which you should look to normalization to reduce your data storage overheads.

    From Campbell
  • Inventory is a many-to-one relationship anyway, so probably deserves its own table (indexed on its foreign key at least).

    You may also want to have personal stuff for each user separate from public stuff (i.e., what others can see of you). That may provide better cache-hits, since many people will see your public attributes, but only you see your personal attributes.

    From Tanktalus
  • Relational databases work on sets, a database query is there to deliver none ("not found") or more results. A relational database is not intended to deliver always exactly one result by executing a query (on relations).

    Said that I want to mention that there is real life, too ;-) . A one-to-one relation might make sense, i.e. if the column count of the table reaches a critical level it might be faster to split this table. But these kind of conditions really are rare.

    If you do not really need to split the table just don't do it. At least I assume that in your case you would have to do a select over two tables to get all data. This probably is slower than storing all in one table. And your programming logic will at least get a little bit less readable by doing so.

    Edith says: Commenting on normalization: Well, I never saw a database normalized to it's max and no-one is really recommending that as an option. Anyway, if you do not exactly know if any columns will be normalized (i.e. put in other tables) later on, then it too is easier to do this with one table instead of "table"-one-to-one-"other table"

    Joe Philllips : One nice thing about keeping it separated is that you can add/remove fields if you ever want to change how your database is set up. It's more difficult to do this if the fields are all built into the same table.
    Georgi : I doubt this. Why should it be easier? "ALTER TABLE x DROP COLUMN y" is the same on both tables. And updating your program is the same, too. Where's the bar-gain?
    Matt Rogish : "A relational database is not intended to deliver always exactly one result by executing a query." <- this does not make sense? "SELECT * FROM table WHERE primary_key = something" will always return one row (if it exists). Set theory doesn't care if the set contains 0, 1, or m tuples.
    Georgi : Let me explain: A relational database is not intended to deliver always exactly one result by executing a query "on relations".
    From Georgi
  • A 1-1 relationship is just a vertical split. Nothing else. do it if you for some reason you won't store all the data in the same table (say for partitioning across multiple servers etc)

    mjard : Currently it'd just be running on one server. Needing to scale is pretty far in the future.
    From Midhat
  • I would suggest that you keep each record type in its own table.

    Player Logins are one type of record. Inventory is another. They should not share a table as most of the information isn't shared. Keep your tables simple by separating out unrelated information.

    From Brody
  • As you can see, the general consensus on this is that "it depends". Performance/query optimisation easily comes to mind - as mentioned by @Campbell and others.

    But I think for the kind of application-specific database you are building, it is best to start by considering the overall application design. For application-specific databases (as opposed to databases designed to be used with many apps or reporting tools), the 'ideal' data model is probably the one that best maps to your domain model as implemented in your application code.

    You may not call your design MVC, and I don't know what language you are using, but I expect you have some code or classes that wraps the data access with a logical model. How you view the application design at this level is I think the best indicator of how your database should ideally be structured.

    In general, this means a one-to-one mapping of domain objects to database tables is the best place to start.

    I think best to illustrate what I mean by example:

    Case 1: One Class / One Table

    You think in terms of a Player class. Player.authenticate, Player.logged_in?, Player.status, Player.hi_score, Player.gold_credits. As long as all these are actions on a single user, or represent single-value attributes of a user, then a single table should be your starting point from a logical application design perspective. Single class (Player), single table (players).

    Case 2: Multiple Class / One or more tables

    Maybe I don't like the swiss-army-knife Player class design, but prefer to think in terms of User, Inventory, Scoreboard and Account. User.authenticate, User.logged_in?, User->account.status, Scoreboard.hi_score(User), User->account.gold_credits.

    I'm probably doing this because I think separating these concepts in the domain model is going to make my code clearer and easier to write. In this case, the best starting point is a direct mapping of the domain classes to individual tables: Users, Inventory, Scores, Accounts etc.

    Making the 'Ideal' Practical

    I slipped some words in above to indicate that the one-one mapping of domain objects to tables is the ideal, logical design.

    That is my preferred starting point. Trying to be too smart off the bat - like mapping multiple classes back to a single table - tends to just invite trouble I'd prefer to avoid (deadlocks and concurrency issues especially).

    But it's not always the end of the story. There are practical considerations that can mean a separate activity to optimize the physical data model is needed e.g. concurrency/locking issues? row size getting too large? indexing overhead? query performance etc.

    Be careful when thinking about performance though: just because it may be one row in one table, probably doesn't mean it is queried just once to get the whole row. I imagine the multiuser game scenario may involve a whole series of calls to get different player information at different times, and the player always want the 'current value' which may be quite dynamic. That may translate into many calls for just a few columns in the table each time (in which case, a multiple-table design could be faster than a single-table design).

    From tardate
  • Start by ignoring performance, and just create the most logical and easy-to-understand database design that you can. If players and game objects (swords? chess pieces?) are separate things conceptually, then put them in separate tables. If a player can carry things, you put a foreign key in the "things" table that references the "players" table. And so on.

    Then, when you have hundreds of players and thousands of things, and the players run around in the game and do things that require database searches, well, your design will still be fast enough, if you just add the appropriate indexes.

    Of course, if you plan for thousands of simultaneous players, each of them inventorying his things every second, or perhaps some other enormous load of database searches (a search for each frame rendered by the graphics engine?) then you will need to think about performance. But in that case, a typical disk-based relational database will not be fast enough anyway, no matter how you design your tables.

  • Whether to keep

    player login information [separate] from other in game information (inventory, stats, and status)

    is often a question of normalization. You may want to take a quick look at the wikipedia (Third Normal Form, Denormalization) definitions. Whether you separate these into multiple tables depends on what data is stored. Expanding your question will make this concrete. Data that we want to store is:

    • username: unique name that allows a user to login to the system
    • passwd: password: associated with the username that guarantees the user is unique
    • email: email address for the user; so the game can "poke" the user when they haven't played recently
    • character: possibly separate ingame name for the character
    • status: is the player and/or character currently active
    • hitpoints: an example stat for the character

    We could store this as a single table or as multiple tables.

    Single Table Example

    If players have a single character in this game world, then a single table may be appropriate. For example,

    CREATE TABLE players(
      id         INTEGER NOT NULL,
      username   VARCHAR2(32) NOT NULL,
      password   VARCHAR2(32) NOT NULL,
      email      VARCHAR2(160),
      character  VARCHAR2(32) NOT NULL,
      status     VARCHAR2(32),
      hitpoints  INTEGER,
    
      CONSTRAINT pk_players PRIMARY KEY (uid)
    );
    

    This would allow you to find all of the pertinent information about a player with a single query on the players table.

    Multiple Table Example

    However, if you wanted to allow a single player to have multiple different characters you would want to split this data across two different tables. For example you might do the following:

    -- track information on the player
    CREATE TABLE players(
      id         INTEGER NOT NULL,
      username   VARCHAR2(32) NOT NULL,
      password   VARCHAR2(32) NOT NULL,
      email      VARCHAR2(160),
    
      CONSTRAINT pk_players PRIMARY KEY (id)
    );
    
    -- track information on the character
    CREATE TABLE characters(
      id         INTEGER NOT NULL,
      player_id  INTEGER NOT NULL,
      character  VARCHAR2(32) NOT NULL,
      status     VARCHAR2(32),
      hitpoints  INTEGER,
    
      CONSTRAINT pk_characters PRIMARY KEY (id)
    );
    -- foreign key reference
    ALTER TABLE characters
      ADD CONSTRAINT characters__players
      FOREIGN KEY (player_id) REFERENCES players (id);
    

    This format does require joins when looking at information for both the player and character; however, it makes updating records less likely to result in inconsistency. This latter argument is typically used when advocating for multiple tables. For example, if you had a player (LotRFan) with two characters (gollum, frodo) in the single table format you would have the username, password, email fields duplicated. If the player wanted to change his email/password, you would need to modify both records. If only one record was modified the table will become inconsistent. However, if LotRFan wanted to change his password in the multiple table format a single row in the table is updated.

    Other Considerations

    Whether to use a single table or multiple tables depends on the underlying data. What hasn't been described here but was noted in other answers is that optimizations often will take two tables and combine them into a single table.

    Also of interest is knowing the types of changes that will be made. For example, if you were to choose to use a single table for players who might have multiple characters and wanted to keep track of total number of commands issued by the player by incrementing a field in the players table; this would result in more rows being updated in the single table example.

    From terson
  • I recommend separating them. Not for any database reasons, but for development reasons.

    When you're coding your player login module, you don't want to think about any of the game information. And visa-versa. It will be easier to maintain a separation of concerns if the tables are distinct.

    It boils down to the fact that your game information module ain't got no business messin' around with the player login table.

  • i agree with Thomas Padron-McCarthy's answer:

    Solving for the case of thousands of simultaneous users is not your problem. Getting people to play your game is the problem. Start with the simplest design - get the game done, working, playable, and easily expandable. If the game takes off, then you de-normalize.

    It's also worth mentioning that indexes can be looked at as separate tables that contain a narrower view of the data.

    People have been able to infer a design used by Blizzard for World of Warcraft. It seems that the quests a player is on is stored with the character. e.g.:

    CREATE TABLE Players (
       PlayerID int,
       PlayerName varchar(50),
       ...
       Quest1ID int,
       Quest2ID int,
       Quest3ID int,
       ...
       Quest10ID int,
       ...
    )
    

    Initially you could be on at most 10 quests, later it was expanded to 25, e.g.:

    CREATE TABLE Players (
       PlayerID int,
       PlayerName varchar(50),
       ...
       Quest1ID int,
       Quest2ID int,
       Quest3ID int,
       ...
       Quest25ID int,
       ...
    )
    

    This is in contrast to a split design:

    CREATE TABLE Players (
       PlayerID int,
       PlayerName varchar(50),
       ...
    )
    
    CREATE TABLE PlayerQuests (
       PlayerID int,
       QuestID int
    )
    

    The latter is much easier conceptually to deal with, and it allows the arbitrary number of quests. On the other hand you have to join Players and PlayerQuests in order to get all that information.

    From Ian Boyd

0 comments:

Post a Comment