Contributing content types

Providing a new content type

The platform defines some fundamental content types, such as plain text and XML. These content types are defined the same way as those contributed by any other plug-in. We will look at how the platform defines some of its content types in order to better understand the content type framework.

Plug-ins define content types by contributing an extension to the extension point org.eclipse.core.runtime.contentTypes. In this extension, a plug-in specifies a simple id and a name for the content type (the full id is always the simple id prefixed by the current namespace). The following snippet shows a trimmed-down version of the org.eclipse.core.runtime.text content type contribution:

	<extension point="org.eclipse.core.runtime.contentTypes">
		<content-type 
			id="text"
			name="%textContentTypeName"
			file-extensions="txt">
			<describer class="org.eclipse.core.internal.content.TextContentDescriber"/>
		</content-type>
		...

The file-extensions attribute defines which file extensions are associated with the content type (in this example, ".txt"). The file-names attribute (not used in this case) allows associating full file names. Both attributes are taken into account by the platform when performing content type detection and description (if the client provides a file name).
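
For illustration, a content type can combine both attributes. The following sketch is hypothetical: the id, the name, the file extension and the file name are invented for this example and do not correspond to any content type shipped with the platform:

	<extension point="org.eclipse.core.runtime.contentTypes">
		<!-- hypothetical content type: associated with "*.notes" files and with files named exactly "README" -->
		<content-type
			id="notes"
			name="Notes"
			file-extensions="notes"
			file-names="README"/>
	</extension>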

The describer element is used to define a content describer for the content type.

Detecting and describing content

A content type should provide a content describer if there are any identifiable characteristics that allow automatic content type detection, or any interesting properties in data belonging to the content type. In the case of org.eclipse.core.runtime.text, it is not possible to figure out the content type by just looking at the contents. However, text streams might be prepended by a byte order mark, which is a property clients might be interested in knowing about, so this warrants a content describer.

The describer is an implementation of IContentDescriber or ITextContentDescriber. The latter is a specialization of the former that must be implemented by describers of text-oriented content types. Regardless of the nature of the content type, the describer has two responsibilities: helping determine whether its content type is appropriate for a given data stream, and extracting interesting properties from a data stream that supposedly belongs to its content type.

The method describe(stream, description) is called whenever the platform is trying to determine the content type for a particular data stream or describe its contents. The description is null when only detection is requested. Otherwise, the describer should try to fill the content description with any properties that could be found by reading the stream, and only those. The content type markup should be used to declare any properties that have default values (for example, org.eclipse.core.runtime.xml declares UTF-8 as the default charset).

When performing its duty, the content describer is expected to execute as quickly as possible: the less of the data stream it has to read, the better. The content describer implementation is also expected to be declared in a package that is exempt from plug-in activation (see the Eclipse-AutoStart bundle manifest header). Since all describers are instantiated when the content type framework is initialized, failing to comply with this requirement causes premature activation, which must be avoided. Future implementations of the platform might refuse to instantiate describers if doing so would trigger activation of the corresponding plug-in.
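
A plug-in does not always have to write its own describer. As a hedged sketch, the following markup assumes that the target platform provides the org.eclipse.core.runtime.content.XMLRootElementContentDescriber class and that it accepts an "element" parameter naming the expected root element; check the Javadoc of the platform version in use before relying on either. The content type id, name and root element name are hypothetical:

	<extension point="org.eclipse.core.runtime.contentTypes">
		<!-- hypothetical content type: XML files whose root element is "myElement" -->
		<content-type
			id="myContentType"
			name="My Content Type"
			base-type="org.eclipse.core.runtime.xml"
			file-extensions="xml">
			<!-- assumption: describer class and parameter name as provided by the platform -->
			<describer class="org.eclipse.core.runtime.content.XMLRootElementContentDescriber">
				<parameter name="element" value="myElement"/>
			</describer>
		</content-type>
	</extension>

Because such a describer only needs to read up to the root element, it fits the expectation that describers read as little of the stream as possible.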

Extending an existing content type

Content types are hierarchical in nature. This allows new content types to leverage the attributes or behavior of more general content types. For example, a content type for XML data is considered a child of the text content type:

<content-type 
	id="xml"
	name="%xmlContentTypeName"
	base-type="org.eclipse.core.runtime.text"
	file-extensions="xml">
	<describer class="org.eclipse.core.internal.content.XMLContentDescriber"/>
	<property name="charset" default="UTF-8"/>
</content-type>

An XML file is deemed a kind of text file, so any features applicable to the latter should be applicable to the former as well.

Note that the XML content type overrides several content type attributes originally defined in the text content type, such as the file associations and the describer implementation. Also, this content type declares a default value for the charset property. This means that during content description for a data stream considered as belonging to the XML content type, if the describer does not fill in the charset property, the platform will set it to "UTF-8".

As another example, the org.eclipse.ant.core.antBuildFile content type (for Ant Build Scripts) extends the XML content type:

<content-type
	id="antBuildFile"
	name="%antBuildFileContentType.name"
	base-type="org.eclipse.core.runtime.xml"
	file-names="build.xml"
	file-extensions="macrodef,ent,xml">
	<describer
		class="org.eclipse.ant.internal.core.contentDescriber.AntBuildfileContentDescriber">
	</describer>
</content-type>

Note that the default value for the charset property is inherited. It is possible to cancel an inherited property or describer by redeclaring them with the empty string as value.
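
As a sketch of such a cancellation, a hypothetical content type derived from the XML content type could drop both the inherited describer and the inherited charset default. The id, name and file extension are invented for illustration, and it is an assumption here that the empty string goes in the class and default attributes respectively:

<content-type
	id="rawXml"
	name="Raw XML"
	base-type="org.eclipse.core.runtime.xml"
	file-extensions="rawxml">
	<!-- assumption: an empty class cancels the describer inherited from the XML content type -->
	<describer class=""/>
	<!-- assumption: an empty default cancels the inherited "UTF-8" charset default -->
	<property name="charset" default=""/>
</content-type>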

Additional file associations

New file associations can be added to existing content types. For instance, the Resources plug-in associates the org.eclipse.core.runtime.xml content type with ".project" files:

<extension point="org.eclipse.core.runtime.contentTypes">
	<file-association content-type="org.eclipse.core.runtime.xml" file-names=".project"/>
	...
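
A file association can presumably be declared by extension as well as by name. The following sketch assumes that the file-association element accepts a file-extensions attribute in the same way as content-type does; the "cfgx" extension is hypothetical:

<extension point="org.eclipse.core.runtime.contentTypes">
	<!-- hypothetical: treat "*.cfgx" files as XML without defining a new content type -->
	<file-association content-type="org.eclipse.core.runtime.xml" file-extensions="cfgx"/>
</extension>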

Content type aliasing

Due to the extensible nature of Eclipse, a content type that a plug-in relies on may not be available in a given product configuration. This can be worked around by using content type aliasing. A content type alias is a placeholder for another preferred content type whose availability is not guaranteed. For instance, the Runtime declares an alias (org.eclipse.core.runtime.properties) for the Java properties content type provided by the Java development tools (JDT) (org.eclipse.jdt.core.javaProperties):

<!-- a placeholder for setups where JDT's official type is not available -->
<content-type
	id="properties"
	name="%propertiesContentTypeName"
	base-type="org.eclipse.core.runtime.text"
	alias-for="org.eclipse.jdt.core.javaProperties"
	file-extensions="properties">
	<property name="charset" default="ISO-8859-1"/>
</content-type>

This provides plug-ins with a placeholder they can refer to regardless of whether the preferred content type is available. If it is, the alias content type is suppressed from the content type catalog and any references to it are interpreted as references to the target content type. If it is not, the alias is used as an ordinary content type.
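
As a sketch of how a plug-in might refer to the alias, the hypothetical content type below uses it as its base type. The id, name and file extension are invented for illustration, and it is an assumption that a base-type reference is resolved like any other reference to the alias:

<!-- hypothetical content type built on the alias rather than on JDT's type directly -->
<content-type
	id="messageBundle"
	name="Message Bundle"
	base-type="org.eclipse.core.runtime.properties"
	file-extensions="msgs"/>

Under that assumption, the base type would resolve to org.eclipse.jdt.core.javaProperties when JDT is installed and to the alias itself otherwise.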