LDA for World Trade - Analysis Dashboard

This dashboard was made to analyse the results associated with the article Kozlowski D, Semeshenko V, Molinari A (2021) Latent Dirichlet allocation model for world trade analysis. PLoS ONE 16(2): e0245393. https://doi.org/10.1371/journal.pone.0245393

In the panel Components we show the results for the LDA model for different numbers of components, k. We use this panel for selecting the optimum value of k, and for labelling each component on the selected model.

In the panel Countries we show the results for the selected model (k=30), using the labels previously made.

On the tab Components by country we show, for a given country or set of countries, the most relevant components of their exports over time.

On the tab Countries by component we show, for a given component or set of components, those countries that have the highest proportion of their exports in that component, by decade.

Analysis of the Components

Characterising the model for different values of k, and then labelling the components, involves the following steps:

1. We first decided how many products to analyse: up to the 10 with the highest share. The more concentrated the component (judged by the cumulative probability), the fewer products are needed for a good characterisation.

2. We then defined a concept that generalises products with the highest share. For example, for k = 30, in component 1 the first four products are coal, iron, gold and aluminium, which can then be labelled as Minerals.

3. For components where the top 10 products have a cumulative share of less than 30%, we looked at the overall distribution of the component across Lall's groups (see below).

4. After studying all the components of different models (i.e. different values of k), we selected the model that satisfies the following criteria:

a. It is feasible to label most of the k components;

b. Components do not repeat among themselves;

c. The distribution of components gives a high cumulative share (more than 30%) for the first 10 products in the majority of the components.
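The 30% threshold used in step 3 and criterion c can be checked numerically. The following is a minimal sketch, not the code behind the dashboard: it assumes each component is available as a mapping from product name to share, and the function names and toy shares are hypothetical.

```python
def top_n_cumulative_share(product_shares, n=10):
    """Sum of the n largest product shares within one component."""
    return sum(sorted(product_shares.values(), reverse=True)[:n])


def is_concentrated(product_shares, n=10, threshold=0.30):
    """True when the top-n products account for more than the threshold."""
    return top_n_cumulative_share(product_shares, n) > threshold


def fraction_concentrated(components, n=10, threshold=0.30):
    """Fraction of components in a model that pass the concentration check."""
    flags = [is_concentrated(c, n, threshold) for c in components]
    return sum(flags) / len(flags)


# Toy data (made-up shares, for illustration only).
minerals = {"coal": 0.20, "iron": 0.15, "gold": 0.10, "aluminium": 0.05}
minerals.update({f"other_{i}": 0.005 for i in range(100)})
diffuse = {f"product_{i}": 0.01 for i in range(100)}

print(top_n_cumulative_share(minerals))
print(is_concentrated(diffuse))
print(fraction_concentrated([minerals, diffuse]))
```

Under criterion c, a candidate model would be preferred when the last value is above one half, that is, when the majority of its components are concentrated.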

With this analysis we label the components by Group, Subgroup and level of complexity, for the selected model.

Lall classification of components

The Lall classification divides products by origin (primary products, resource-based manufactures and non-resource-based manufactures) and by technology level (low, medium, high).

For each component, we project the distribution as a weighted average of Lall's groups, using the share of each product in the component.

Given that not all products in SITC are classified by Lall, some of them go to the Unclassified products group.
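The projection onto Lall's groups can be sketched as a share-weighted sum per group. This is an illustrative sketch under assumptions: the function name is hypothetical, and the product-to-group mapping shown is a toy example, not the actual Lall classification.

```python
from collections import defaultdict


def lall_distribution(product_shares, lall_group_of,
                      unclassified="Unclassified products"):
    """Aggregate a component's product shares by Lall group.

    product_shares: mapping product -> share within the component.
    lall_group_of:  mapping product -> Lall group (may be incomplete).
    Products without a group are assigned to the unclassified label.
    """
    totals = defaultdict(float)
    for product, share in product_shares.items():
        group = lall_group_of.get(product, unclassified)
        totals[group] += share
    return dict(totals)


# Toy component and toy mapping (both hypothetical).
component = {"coal": 0.5, "steel": 0.3, "widget": 0.2}
groups = {"coal": "Primary products",
          "steel": "Resource-based manufactures"}

print(lall_distribution(component, groups))
```

If the product shares of a component sum to one, the resulting group shares also sum to one, so the output can be read directly as a distribution over Lall's groups.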

If the first 10 products of the component have a cumulative share of less than 30%, the Lall groups allow us to characterise it:

- If the distribution is skewed, this means that the component is still well defined, but includes multiple products of the same type (see for example, component 4 for k = 30),

- If the distribution is uniform, this means that the component is ill defined (see for example, component 2 for k = 2).
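The text does not say how skewed and uniform distributions are told apart, and the judgement may well be visual. One possible numerical summary, offered here only as an assumption, is the normalised Shannon entropy of the Lall distribution: values near 1 suggest a uniform (ill-defined) component, and lower values suggest a skewed (well-defined) one. The function name and toy numbers are hypothetical.

```python
import math


def normalised_entropy(distribution):
    """Shannon entropy of a group distribution, scaled to [0, 1].

    distribution: mapping group -> share. Returns 1.0 for a perfectly
    uniform distribution and 0.0 when one group holds everything.
    """
    n_groups = len(distribution)
    total = sum(distribution.values())
    if n_groups < 2 or total == 0:
        return 0.0
    entropy = 0.0
    for share in distribution.values():
        if share > 0:
            p = share / total
            entropy -= p * math.log(p)
    return entropy / math.log(n_groups)


# Toy distributions over four hypothetical groups.
uniform = {"g1": 0.25, "g2": 0.25, "g3": 0.25, "g4": 0.25}
skewed = {"g1": 0.85, "g2": 0.05, "g3": 0.05, "g4": 0.05}

print(normalised_entropy(uniform))
print(normalised_entropy(skewed))
```

Any cut-off between the two cases would have to be chosen by the analyst; the source does not give one.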

Maintainer: Diego Kozlowski

This plot shows the components with the highest average share in the exports of each of the selected countries.

The smooth option fits a loess regression for each component.

We can study the changes in the specialisation patterns of each country, by looking at the shifts over time from one component to another.

Top components by country

This plot shows the countries with the highest participation in a component. The horizontal axis shows how much of each country's exports the component represents, on average, in each decade.

With this plot, we can study how different countries participate in a component's history.

We can also study how a component represents obsolete technologies (see, for example, component 5) or modern ones (see, for example, component 27).
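The per-decade averages behind this kind of ranking can be sketched as follows. This is a minimal illustration, not the dashboard's own code: it assumes the model output is available as (country, year, component, share) records, and the function names, country codes and shares are hypothetical.

```python
from collections import defaultdict


def decade_of(year):
    """Map a year to the first year of its decade, e.g. 1995 -> 1990."""
    return (year // 10) * 10


def top_countries_by_decade(records, component, top_n=3):
    """Rank countries by their average share in one component, per decade.

    records: iterable of (country, year, component, share) tuples.
    Returns {decade: [(country, average_share), ...]} sorted by share.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for country, year, comp, share in records:
        if comp != component:
            continue
        key = (decade_of(year), country)
        sums[key] += share
        counts[key] += 1

    by_decade = defaultdict(list)
    for (decade, country), total in sums.items():
        by_decade[decade].append((country, total / counts[(decade, country)]))

    return {
        decade: sorted(rows, key=lambda row: row[1], reverse=True)[:top_n]
        for decade, rows in sorted(by_decade.items())
    }


# Toy records (made-up values).
records = [
    ("A", 1991, 5, 0.4), ("A", 1995, 5, 0.2), ("B", 1993, 5, 0.5),
    ("A", 2001, 5, 0.1), ("B", 2002, 5, 0.05), ("A", 1992, 7, 0.9),
]
print(top_countries_by_decade(records, component=5))
```

A component whose top countries lose share from one decade to the next would be a candidate for the obsolete-technology reading described above.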

Top countries by component
