Fastest way making records unique in a huge PostgreSQL table

I am currently working on a project were we are collecting huge amount of data. Recently we discovered that millions of rows were not unique. Because we are using Machine Learning to find patterns in the data and to cluster it, these dublicate rows were polluting the results of the learning algoritme. Therefor we had to come up with a solution to remove all dublicates and prevent new dublicates in the future. Continue reading

How to Configure a Self Referencing Entity in Code First

Recently I was working on a project that required a categorization option. I wanted the user to have the option to create categories and as many subcategories as they wanted. To realize this I created a category model that was self referencing:

Continue reading