I'm working on two systems that have a common characteristic: entities whose attributes are dynamic (ie can not be "hard-coded"), need to be queried in search operations (result filtering) , but are otherwise decoupled from the rest of the system (ie there is nothing referring to them, nor are they used in any particular calculation). I'm having trouble finding an appropriate way to represent them.
So far, I've been using the (anti-) EAV . There is a Entity
table that identifies the entity, a Tag
( NxN
with Entity
) table that says the "type" - or "class" - of the entity, a Attribute
table Nx1
) that identifies the property, its type, and whether or not it can be absent, and finally several tables Tag
( XValue
, TextValue
, IntValue
etc) that say "entity E possesses to the attribute A is the value V ".
Contextualizing an entity could be a product for sale, its tag the type of product (eg "car"), attributes being the characteristics of a car (eg "brand", "model", " kilometers ") and the values of the application of those characteristics to the product for sale (eg" Fiat "," Palio "," 20000 "). Types of products can be created, altered and deleted (dynamic), you can consult a product type by its characteristics (eg filter cars by the "Fiat" brand), but there is no foreign key for a specific attribute, and ever there will be any "non-generic" calculation involving product features.
And, as I said, I currently implement this in the form of EAV templates - in a PostgreSQL database, cross-platform at first but more towards Linux.
Usually using EAV is a "potential WTF alert," but I believe that this would be one of its legitimate uses (the data structures modeled in this way are actually data , not < strong> meta-data ). The vast majority of the bank is formalized, only a few entities were modeled in this way (and I took steps to ensure that it would be possible to formalize any model that might be coupled to the rest of the system.)
However, I'm looking for alternatives to this model - since it's laborious to deal with and full of pitfalls . I thought of representing each entity by a simple XML, but there is the requirement of the filtered search - which I do not know how to do [efficiently]. I have no experience with non-relational DBMS ( "NoSQL" ), so I do not know if its performance characteristics would be acceptable (I am providing for a high number of readings - filtered searches - simultaneous as well as simultaneous writes - even if not in the same entity. Other ideas came to mind, but they seem to me to be "crazy" to use in practice (eg dynamically change the DB schema whenever a "type" is modified).
I ask that answers be based on previous experience involving systems with similar requirements, not just theoretical opinion and / or background, if possible. Suggestions for specific technologies are welcome, but what I'm looking for is a solution strategy , not a software recommendation .
Update: For those not familiar with "Entity-Attribute-Value" (unfortunately I did not find any Portuguese material on the subject)
In a traditional (formalized) modeling, a "Car" entity could be represented as:
Carros
ID Marca Modelo Km
----------------------------------
1 Fiat Palio 20000
2 Honda Fit 10000
In an EAV modeling, this is done like this:
Entidades Atributos Valores
ID Tipo ID Nome Entidade Atributo Valor
---------- ------------- -----------------------
1 Carro 1 Marca 1 1 Fiat
2 Carro 2 Modelo 1 2 Palio
3 Km 1 3 20000
2 1 Honda
2 2 Fit
2 3 10000
The advantage of this representation is that the "Car" type is not hardcoded: if I want to add a "Color" attribute, I do not need to tinker with the table structure, create new queries to deal with this attribute, etc. - just create one more line in the DateValue
table and for each car one more line in the Atributos
table. If in addition to cars I want to represent something else (eg real estate), just create lines in the Valores
table with this new type, and give it attributes in the same way as "Cars".
The disadvantages are ... well, all possible and imaginable ! Generally, this is a standard to be avoided at all costs (i.e., an "anti-standard"). However, even the greatest critics admit that - in certain cases - it is inevitable. My question is: 1) Does this apply to my case? or is there a formalized way to meet my requirements? 2) if it is inevitable, if there are good alternatives that have acceptable performance.