The IBGE Aggregate Data API (version 3) is the programmatic interface behind SIDRA, IBGE’s automatic data retrieval system. It covers every survey and census produced by the Brazilian Institute of Geography and Statistics.
This vignette explains the API’s data model so you can make the most of ibger. If you’re familiar with OLAP terminology: variables = measures, classifications = dimensions, and categories = members.
An aggregate is a specific table of results from an IBGE survey. Each aggregate has a numeric ID that is stable over time. The aggregate listing, which ibge_aggregates() retrieves from GET /agregados, can be filtered by periodicity, geographic level, subject, or classification.
Each aggregate exposes one or more variables — the measures being reported. Aggregate 1712 (crop production), for example, has:
| ID | Variable |
|---|---|
| 214 | Quantidade produzida (Production qty) |
| 215 | Valor da produção (Production value) |
| 216 | Área colhida (Harvested area) |
| 1982 | Quantidade vendida (Sold qty) |
| … | … |
When calling ibge_variables(), you can request specific variables by ID through the variable argument. Use variable = NULL (the default) for all standard variables, or variable = "all" to include API-generated percentage variables when available.
Besides being linked to a locality and a period, each observation can be further broken down by classifications (dimensions). Each classification contains categories (members).
For aggregate 1712, the classifications are things like “type of product” (226), “producer condition” (218), “economic activity group”, etc. Classification 226 has categories like “pineapple” (4844), “garlic” (96608), “potato” (96609), and hundreds more.
```r
meta <- ibge_metadata(1712)
meta$classifications

# Unnest to see all categories
tidyr::unnest(meta$classifications, categories)
```

When you don’t specify a classification, the API returns results for the Total category (ID = 0), which aggregates across all categories.
```r
# Default: Total category (aggregated across all products)
ibge_variables(1712, localities = "BR")

# Specific products
ibge_variables(
  1712,
  localities = "BR",
  classification = list("226" = c(4844, 96608))
)

# All products (can be large)
ibge_variables(
  1712,
  periods = -1,
  localities = "BR",
  classification = list("226" = "all")
)
```

IBGE organizes Brazil into a hierarchy of geographic levels. Each aggregate supports a specific subset of these levels:
| Code | Level | Count | Example |
|---|---|---|---|
| N1 | Brazil | 1 | BR |
| N2 | Major region | 5 | 1 (North), 3 (Southeast) |
| N3 | State (UF) | 27 | 33 (RJ), 35 (SP) |
| N6 | Municipality | 5,570+ | 3550308 (São Paulo city) |
| N7 | Metropolitan area | varies | 3501 (RM São Paulo) |
| N9 | Immediate region | varies | … |
| N15 | Intermediate region | varies | … |
Important: municipality IDs (N6) and metropolitan area IDs (N7) use different numbering. São Paulo city is 3550308 (N6), while the São Paulo metropolitan area is 3501 (N7). Don’t confuse them.
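One way to catch a mixed-up ID before sending a request is to check its shape. The sketch below relies on the fact that IBGE municipality codes have seven digits and begin with the two-digit code of their state. The helper name looks_like_municipality() is hypothetical and not part of ibger, and the check only validates the shape of the ID, not whether the municipality exists.

```r
# State (N3) codes; a municipality (N6) ID starts with one of these.
uf_codes <- c(11:17, 21:29, 31:33, 35, 41:43, 50:53)

# Hypothetical helper: TRUE when `id` has the shape of a municipality ID
# (seven digits, first two digits a valid state code).
looks_like_municipality <- function(id) {
  id <- if (is.numeric(id)) {
    format(id, scientific = FALSE, trim = TRUE)
  } else {
    as.character(id)
  }
  grepl("^[0-9]{7}$", id) & as.integer(substr(id, 1, 2)) %in% uf_codes
}

# São Paulo city (N6) passes; the São Paulo metropolitan area (N7) does not.
looks_like_municipality(c(3550308, 3501))
#> [1]  TRUE FALSE
```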
The levels available for each aggregate are listed in its metadata, which ibge_metadata() returns.
You can request all localities at a level, or pick specific ones:
```r
# All states
ibge_variables(1705, localities = "N3")

# Specific states
ibge_variables(1705, localities = list(N3 = c(33, 35)))
```

The API also supports contextual queries, which filter municipalities by their parent state or region. For example, N6[N3[33,35],N2[1]] means “all municipalities in RJ, SP, or the North region”. ibger passes this expression through to the API unchanged.
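Because a contextual expression is just a nested string, it can be assembled programmatically rather than typed by hand. The following is a minimal sketch: level_filter() and contextual_localities() are hypothetical helpers, not ibger functions, and the final call is shown only as a comment because it assumes ibge_variables() accepts the string as-is.

```r
# Build one filter such as "N3[33,35]" from a level code and a vector of IDs.
level_filter <- function(level, ids) {
  sprintf("%s[%s]", level, paste(ids, collapse = ","))
}

# Build a contextual expression: every locality at `level` that falls
# inside any of the parent filters given as a named list.
contextual_localities <- function(level, parents) {
  inner <- mapply(level_filter, names(parents), parents, USE.NAMES = FALSE)
  sprintf("%s[%s]", level, paste(inner, collapse = ","))
}

expr <- contextual_localities("N6", list(N3 = c(33, 35), N2 = 1))
expr
#> [1] "N6[N3[33,35],N2[1]]"

# Assumed usage (not run here):
# ibge_variables(1705, localities = expr)
```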
Each aggregate has a fixed periodicity:
| Code | Periodicity |
|---|---|
| P5 | Monthly |
| P10 | Quarterly |
| P13 | Annual |
| P58 | Semi-annual |
Period codes encode both the date and the periodicity. The code 202001 means different things depending on the aggregate’s periodicity:

- Monthly (P5): January 2020
- Quarterly (P10): Q1 2020
- Semi-annual (P58): S1 2020

The metadata tells you the valid range:

```r
meta <- ibge_metadata(7060)
meta$periodicity
#> $frequency [1] "mensal"
#> $start [1] "202001"
#> $end [1] "202512"
```

ibger’s ibge_periods() lists every individual period.
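The way one period code is read under different periodicities can be sketched in a few lines of base R. decode_period() is a hypothetical helper, not part of ibger. It assumes that the last two digits of a six-digit code are the index of the month, quarter, or semester within the year, and it deliberately leaves out annual (P13) codes, whose format is not described here.

```r
# Hypothetical helper: decode a six-digit period code for a given periodicity.
decode_period <- function(code, periodicity) {
  code <- as.character(code)
  limits <- c(P5 = 12L, P10 = 4L, P58 = 2L)  # sub-periods per year
  stopifnot(grepl("^[0-9]{6}$", code), periodicity %in% names(limits))

  year <- substr(code, 1, 4)
  index <- as.integer(substr(code, 5, 6))  # assumed: index within the year
  stopifnot(index >= 1L, index <= limits[[periodicity]])

  switch(
    periodicity,
    P5  = paste(month.name[index], year),
    P10 = sprintf("Q%d %s", index, year),
    P58 = sprintf("S%d %s", index, year)
  )
}

decode_period("202001", "P5")
#> [1] "January 2020"
decode_period("202001", "P10")
#> [1] "Q1 2020"
decode_period("202001", "P58")
#> [1] "S1 2020"
```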
The API allows at most 100,000 values per request. The formula is:
categories × periods × localities ≤ 100,000
For example, a request for aggregate 2654 whose dimensions have 1, 2, 2, 1, 6, and 4 selected values produces 1 × 2 × 2 × 1 × 6 × 4 = 96 values, well within the limit.
If your request exceeds 100,000 values, the API returns HTTP 500. Reduce the number of localities, periods, or categories and retry.
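It can be useful to estimate the size of a request before sending it, and to split a long list of periods into batches when the estimate is too large. The sketch below uses two hypothetical helpers, request_size() and batch_periods(), that are not part of ibger. It assumes the request size is the product of the number of items selected along every dimension, as in the worked example above.

```r
api_limit <- 100000

# Number of values a request would return: the product of the number of
# selected items along each dimension.
request_size <- function(dimension_sizes) {
  prod(dimension_sizes)
}

# Split `periods` into batches that each stay within the limit, given how
# many values a single period contributes.
batch_periods <- function(periods, values_per_period, limit = api_limit) {
  per_batch <- floor(limit / values_per_period)
  stopifnot(per_batch >= 1)
  split(periods, ceiling(seq_along(periods) / per_batch))
}

sizes <- c(1, 2, 2, 1, 6, 4)
request_size(sizes)
#> [1] 96
request_size(sizes) <= api_limit
#> [1] TRUE

# Five periods at 40,000 values each: at most two periods fit in one request.
batches <- batch_periods(
  c("202001", "202002", "202003", "202004", "202005"),
  values_per_period = 40000
)
lengths(batches)
#> 1 2 3 
#> 2 2 1
```

Each batch can then be requested separately and the results combined afterwards.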
The API supports three view modes for the response format. ibger uses the default JSON mode, but you can also pass view = "OLAP" or view = "flat":

```r
# OLAP notation
ibge_variables(1705, localities = "BR", view = "OLAP")

# Flat mode (first element is metadata, data starts at second)
ibge_variables(1705, localities = "BR", view = "flat")
```

In most cases, the default mode with ibger’s tidy output is the most convenient.
Here is a quick reference showing how ibger functions correspond to API endpoints:
| ibger function | API endpoint |
|---|---|
| ibge_aggregates() | GET /agregados |
| ibge_metadata() | GET /agregados/{id}/metadados |
| ibge_periods() | GET /agregados/{id}/periodos |
| ibge_localities() | GET /agregados/{id}/localidades/{nivel} |
| ibge_variables() | GET /agregados/{id}/periodos/{p}/variaveis/{v} |
The ibger parameters map to URL path segments and query parameters:
| ibger parameter | API parameter | Format |
|---|---|---|
| aggregate | {agregado} (path) | Numeric ID |
| variable | {variavel} (path) | 214\|1982 or all or allxp |
| periods | {periodos} (path) | -6 or 201701-201706 or 201701\|201702 |
| localities | localidades (query) | BR or N3 or N6[3550308,3304557] |
| classification | classificacao (query) | 226[4844,96608]\|218[4780] |
| view | view (query) | OLAP or flat |
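To make the mapping concrete, the sketch below assembles a request URL from R values using the formats in the table. It is an illustration only and not how ibger is implemented. build_url(), pipe_join(), and format_classification() are hypothetical helpers, and the base URL is assumed to be the public address of the version 3 API.

```r
# Assumed public base URL of the Aggregate Data API (version 3).
base_url <- "https://servicodados.ibge.gov.br/api/v3/agregados"

# Join several values with "|", the separator used in path segments.
pipe_join <- function(x) paste(x, collapse = "|")

# Turn list("226" = c(4844, 96608), "218" = 4780) into
# "226[4844,96608]|218[4780]".
format_classification <- function(classification) {
  parts <- mapply(
    function(id, categories) {
      sprintf("%s[%s]", id, paste(categories, collapse = ","))
    },
    names(classification), classification,
    USE.NAMES = FALSE
  )
  pipe_join(parts)
}

# Path segments come first, then the query parameters that were supplied.
build_url <- function(aggregate, variable, periods, localities,
                      classification = NULL, view = NULL) {
  path <- sprintf(
    "%s/%s/periodos/%s/variaveis/%s",
    base_url, aggregate, pipe_join(periods), pipe_join(variable)
  )
  query <- c(
    localidades = localities,
    classificacao = if (!is.null(classification)) {
      format_classification(classification)
    },
    view = view
  )
  paste0(path, "?", paste(names(query), query, sep = "=", collapse = "&"))
}

url <- build_url(
  aggregate = 1712,
  variable = c(214, 1982),
  periods = -6,
  localities = "BR",
  classification = list("226" = c(4844, 96608), "218" = 4780)
)
url
#> [1] "https://servicodados.ibge.gov.br/api/v3/agregados/1712/periodos/-6/variaveis/214|1982?localidades=BR&classificacao=226[4844,96608]|218[4780]"
```

The string is left unencoded for readability; an HTTP client may need to percent-encode characters such as `|`, `[`, and `]` before sending it.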