<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Drupal Implementation Decisions (part 7)</title>
	<atom:link href="http://cruncht.com/91/drupal-implementation-decisions/feed/" rel="self" type="application/rss+xml" />
	<link>http://cruncht.com/91/drupal-implementation-decisions/</link>
	<description>Semantic web development and publishing</description>
	<lastBuildDate>Mon, 16 Jan 2012 17:27:58 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
	<item>
		<title>By: Murray Woodman</title>
		<link>http://cruncht.com/91/drupal-implementation-decisions/#comment-1473</link>
		<dc:creator>Murray Woodman</dc:creator>
		<pubDate>Thu, 14 Apr 2011 01:17:43 +0000</pubDate>
		<guid isPermaLink="false">http://cruncht.com/?p=91#comment-1473</guid>
		<description>In Drupal 6 a new table will be made if the field is a multi or is shared. In Drupal 7 a field table is made for fields no matter what. Personally, I was quite happy with the D6 design because it was closest to standard relational DB practices. It did lead to shifting schemas when field definitions changed and this was the cause of problems. However, that is worth living with IMO.

The comments I made above, which you have picked up on, related to content types with a lot of properties (potentially hundreds). &quot;This can be an issue on very large sites on nodes with a lot of properties when you want to keep DB queries to a minimum.&quot; You really, really want to be able to get that data back for a node without (i) doing joins or (ii) looking that data up again with a separate query.

I ran into this problem when I was importing DBpedia (semantic version of Wikipedia) into Drupal 6. DBpedia has hundreds of property types for the various classes. During the data conversion I had to make prudent decisions about just what property types were worth importing. In this case it would be silly to have a property (shared field) used by 99% of instances in one class, but only 1% of instances in another. There is little point in slowing down access times to pick up one piece of data which will be missing for the majority of cases. In the end I opted for shared fields where there was good data density across classes and single fields where there was good data density within a class. 

All I was trying to say above was that there is no sense in smushing fields which are semantically similar into a single multifield. If there is a difference then keep it. If they are the same, then make a multi. 

Yes. This is an esoteric example and perhaps I made too big a deal of it in the article. Generally, I say model your data the best way you can. Drupal has very good tools for building schemas and you shouldn&#039;t worry about stuff like this for the most part.

Mind you, for D7 I have misgivings about the way fields have gone. With the current design it would be impossible to import (most of) DBpedia into D7 as I have done for D6 because of all the lookups or joins in Views. You would have hundreds (thousands?) of tables, each supporting a different property! In this case, the fragmentation of data in MySQL does make a difference to efficiency. So, the scalability of MySQL has been reduced under D7 for large datasets using fields.

HOWEVER, I think when you are pushing millions of rows and hundreds of properties it would be a sensible decision to switch to MongoDB or similar. You then get the benefit of fast lookup and multi-indexed querying. I have yet to investigate that. I&#039;m only a novice with this stuff, but I&#039;d say that a Mongo backend for nodes could be a fruitful avenue for the future for all medium to large sites given the way things have developed in MySQL.</description>
		<content:encoded><![CDATA[<p>In Drupal 6 a new table will be made if the field is a multi or is shared. In Drupal 7 a field table is made for fields no matter what. Personally, I was quite happy with the D6 design because it was closest to standard relational DB practices. It did lead to shifting schemas when field definitions changed and this was the cause of problems. However, that is worth living with IMO.</p>
<p>The comments I made above, which you have picked up on, related to content types with a lot of properties (potentially hundreds). &#8220;This can be an issue on very large sites on nodes with a lot of properties when you want to keep DB queries to a minimum.&#8221; You really, really want to be able to get that data back for a node without (i) doing joins or (ii) looking that data up again with a separate query.</p>
<p>I ran into this problem when I was importing DBpedia (semantic version of Wikipedia) into Drupal 6. DBpedia has hundreds of property types for the various classes. During the data conversion I had to make prudent decisions about just what property types were worth importing. In this case it would be silly to have a property (shared field) used by 99% of instances in one class, but only 1% of instances in another. There is little point in slowing down access times to pick up one piece of data which will be missing for the majority of cases. In the end I opted for shared fields where there was good data density across classes and single fields where there was good data density within a class. </p>
<p>All I was trying to say above was that there is no sense in smushing fields which are semantically similar into a single multifield. If there is a difference then keep it. If they are the same, then make a multi. </p>
<p>Yes. This is an esoteric example and perhaps I made too big a deal of it in the article. Generally, I say model your data the best way you can. Drupal has very good tools for building schemas and you shouldn&#8217;t worry about stuff like this for the most part.</p>
<p>Mind you, for D7 I have misgivings about the way fields have gone. With the current design it would be impossible to import (most of) DBpedia into D7 as I have done for D6 because of all the lookups or joins in Views. You would have hundreds (thousands?) of tables, each supporting a different property! In this case, the fragmentation of data in MySQL does make a difference to efficiency. So, the scalability of MySQL has been reduced under D7 for large datasets using fields.</p>
<p>HOWEVER, I think when you are pushing millions of rows and hundreds of properties it would be a sensible decision to switch to MongoDB or similar. You then get the benefit of fast lookup and multi-indexed querying. I have yet to investigate that. I&#8217;m only a novice with this stuff, but I&#8217;d say that a Mongo backend for nodes could be a fruitful avenue for the future for all medium to large sites given the way things have developed in MySQL.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: James Morrish</title>
		<link>http://cruncht.com/91/drupal-implementation-decisions/#comment-1471</link>
		<dc:creator>James Morrish</dc:creator>
		<pubDate>Wed, 13 Apr 2011 14:01:08 +0000</pubDate>
		<guid isPermaLink="false">http://cruncht.com/?p=91#comment-1471</guid>
		<description>Does it really make a difference if I share a CCK field across multiple content types? CCK creates a field_{name} table whether the field is shared or not, so there will be no improvement in efficiency by keeping them separate? And if you&#039;re sharing fields then it&#039;s because they contain the same type of content - so it&#039;s fairly likely you&#039;d use a view to pull that data across multiple content types regardless.</description>
		<content:encoded><![CDATA[<p>Does it really make a difference if I share a CCK field across multiple content types? CCK creates a field_{name} table whether the field is shared or not, so there will be no improvement in efficiency by keeping them separate? And if you&#8217;re sharing fields then it&#8217;s because they contain the same type of content &#8211; so it&#8217;s fairly likely you&#8217;d use a view to pull that data across multiple content types regardless.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

