cpe
Elite Observer

Remove language dependent columns from tables

Dear Community,

picture this:

I have a project with multiple languages. In my database I have language dependent fields which result in multiple columns. For example I could have something like a table "Person" which has the columns:

  • name_DE
  • name_EN
  • name_NL
  • ...

When I now remove a language, the mappings in the TableTemplate are adjusted automatically. But the columns in the database are not removed.

That is a good thing, because we do not want to lose data by accident.

But how would I go about it, when I do want to remove the columns?

The classes concerning the database have no information that name_DE and name_EN etc. belong together. There is not even information in the column if it originally was language dependent at all.

I can not go ahead and delete everything that ends with "_DE" as there might be a valid column named something which ends with "_DE".

I can not be the first person to want to get rid of a language in the database. How is it done?

Any hints are very welcome!

Cheers
Connz

0 Kudos
Reply
4 Replies
hoebbel
Crownpeak employee

Hello Connz,

As you already wrote FirstSpirit never deletes database columns, as this would result in data loss.
The easiest way to remove unnecessary columns is to clean up the schema and then export and import the project again. The newly imported project will only contain the columns configured in the schema at the time of the export.

If you prefer (or an export/import of the project isn't possible), you can use Content Transport or External Sync to transfer the schema to a new project containing only the desired languages. However, the option 'use mapping based schema merge' must then be activated in the target project. This automatically removes language-dependent columns that are not required in the target project (and creates columns for additional languages). The schema created in this way can then be used either as a blueprint to tidy up the source project's database or, if the old versions of the datasets are not required, it can be merged into the original project with the help of Customer Support. (a possible pitfall here: if you delete the temporary project, be careful not to delete the database content!)

Or do you just want to cleanup the schema.xml? In this case you can use the ContentTransport/ExternalSync path and copy the schema.xml into the original schema afterwards. Just be very careful not to change the first three lines of the original schema, since these define where the database content is stored. In case of doubt ask the customer support for assistance!

Best regards
Holger

0 Kudos
Reply
cpe
Elite Observer

Hey Holger,

thanks for the explanation! First a question that would go before your solutions:

How do I clean up the schema or schema.xml? Or to be more specific: How do I not do that per hand and in a script instead? How can I identify the columns that need to go?

I came up with different ways, but in every one I could find an example where I would delete other columns as well.

Then another question concerning your solutions: If I do have a way to clean up the schema, why would I need to export and import that in one way or another?

Cheers
Connz

0 Kudos
Reply
hoebbel
Crownpeak employee

Hello Connz,

How do I clean up the schema or schema.xml?
See the last section of my first answer 🙂 
Just make a content transport of the schema into a new project. Within this project the option 'use mapping based schema merge' must be activated and only the languages you want to keep have to be defined => only the columns for these languages should remain within the schema. 
[I didn't test it for this answer, so it may be that I am too optimistic here and only the columns ending in the corresponding language abbreviation are missing.]
And another possible pitfall: you have to use the same database type, otherwise the names within the database may change.
BUT while writing this => if the schema in the source project has been changed, it is possible, that the values of the database names might change during the content transport. This will happen, if a table or column has been changed in an incompatible way in the source project (in this case a new column/table was created with _# in its name {depending on the database used it could instead also be just #}. In this case you would have to change the database names back to the values of the original schema.xml. Sorry, I missed this within the first answer and this makes the solution (more or less) useless. ☹️
IF it works, you can edit the schema.xml und copy all of it to the source project. BUT you mustn't change the first 3 lines of the schema.xml of the source project (or to be more accurate, not the values of the third line), since these define where FirstSpirit is expecting the database content within the database.

 

Cleanup of the schema.xml: IF the language specific columns are created with the appropriate option of FirstSpirit, you should always find columns with the schema "name" + "_" + "language abbreviation". 
But if language specific columns had been created manually as a new column, they can have any name schema 😞
Furthermore it is possible to map any column to any language. So for a perfect solution, you would have to iterate over all table templates, collect the columns (of a table) mapped to a specific language and afterward delete all columns, that are only mapped to the language you want to get rid of. (and maybe the columns, that aren't mapped at all???).

 

My first answer focused on the fact that the columns should be deleted in the database itself - not in the schema.xml. If the deleted columns can remain within the database, an export/import is not necessary at all. 

Best regards
Holger

0 Kudos
Reply
cpe
Elite Observer

Hey Holger,

ah, I misunderstood the last section of your first answer! That sounds like a workable solution, thanks! I do not think that my schema has any names with "_#" in them, so that should work. But this sounds like a one-time thing with setting up another project, transporting the schema to and back from it.

What I had in mind would have been something like a script, which goes over the schema and removes the columns there. I had a look at the API for that and found the following issues:

First, the naming schema is tricky in two ways: "name" + "_" + "language abbreviation"

There can also be "_" in the name and in the language abbreviation. Say I want to delete all columns from german, so get rid of everything ending with "_DE"? What if there is a column from a language independent field named "something_DE"? I would falsely also delete that.

If I use the Mappings in the TableTemplates as basis, I would for example come across a field named "something" and could remove anything that is not "something" + "_" + one of the remaining language abbreviations. If by chance there also is a field called "something_else", I would also delete all of those. I do not know which languages need to be deleted, so I could not delete "something_DE" here.

I also thought of just deleting everything that is not mapped. But there could be other fields which are not mapped on purpose. So that's not it either.

I started of with a script that does the opposite. It looks up Mappings and if it finds missing fields for new languages, it adds those (and the mappings to them). Our customer asked to get a remove_fields to the existing add_fields. Sounds simple, but turns out rather tricky. 😉

I had a look into the API and found nothing in the Schema and Columns that would give the information which columns belong together. They are created together, but as soon as they are created, there is no connection any more. I would love to know how the content transport can distinguish them when importing into a new project. Probably parts that are not in the public API?

I am afraid my curiosity needs to stop soon, because this is not a frequently occurring problem and cost/benefit is getting out of hand...

Cheers
Connz

0 Kudos
Reply