Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Thanks for the explanation.

Quite frankly, no application I have ever worked on has had to deal with "huge" amounts of data by any common definition (a couple of gigabytes at the most).

And like I did say, looking at our ERP system's database I am beginning to understand the appeal of a database without a fixed schema. Some of the tables have dozens of columns, with most of the rows being full of NULL values. So I do get that part, but no application I have ever worked on was like that.



Some of the tables have dozens of columns, with most of the rows being full of NULL values.

This is generally addressed in a relational design with a star schema. First create a dimension table:

    CREATE TABLE person (
      id BIGINT PRIMARY KEY NOT NULL
    )
Then create fact tables:

    CREATE TABLE person_name (
      person_id BIGINT REFERENCES person(id) UNIQUE,
      name VARCHAR(128) NOT NULL
    )

    CREATE TABLE person_bank_details (
      person_id BIGINT REFERENCES person(id) UNIQUE,
      bank_detail ....
    )
This avoids large numbers of rows containing nulls, but it violates a normal form. The mnemonic is that the table must contain the Key, the whole key, and nothing but the key, so help me Codd. Anytime you have a "REFERENCES table(pk) UNIQUE", you violate the "whole key" bit.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: