--- title: "Working with OData Queries" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Working with OData Queries} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE ) ``` ## Introduction to OData The BCB PIX Open Data API uses the OData (Open Data Protocol) standard for data access. OData provides a standardized way to query and manipulate data using RESTful APIs. The **pixr** package provides full support for OData query parameters including `$filter`, `$orderby`, `$select`, `$top`, and `$skip`. ## API Endpoints and Parameters Each endpoint requires a specific date/database parameter: | Endpoint | Parameter | Format | R Function | |----------|-----------|--------|------------| | ChavesPix | `Data` | YYYY-MM-DD | `get_pix_keys(date = "2025-12-01")` | | TransacoesPixPorMunicipio | `DataBase` | YYYYMM | `get_pix_transactions_by_municipality(database = "202512")` | | EstatisticasTransacoesPix | `Database` | YYYYMM | `get_pix_transaction_stats(database = "202509")` | | EstatisticasFraudesPix | `Database` | YYYYMM | `get_pix_fraud_stats(database = "202509")` | ## Filtering with $filter All main functions accept a `filter` parameter that uses OData filter syntax. ### Basic Filter Syntax ```{r filter-basic} library(pixr) # Filter transactions by state get_pix_transactions_by_municipality( database = "202512", filter = "Estado eq 'MARANHÃO'" ) # Filter by region get_pix_transactions_by_municipality( database = "202512", filter = "Sigla_Regiao eq 'NE'" ) # Filter transaction stats by nature get_pix_transaction_stats( database = "202509", filter = "NATUREZA eq 'P2P'" ) # Filter PIX keys by key type get_pix_keys( date = "2025-12-01", filter = "TipoChave eq 'CPF'" ) ``` ### Comparison Operators | Operator | Meaning | Example | |----------|---------|---------| | `eq` | Equal | `"Estado eq 'SÃO PAULO'"` | | `ne` | Not equal | `"NATUREZA ne 'P2P'"` | | `gt` | Greater than | `"VALOR gt 1000"` | | `ge` | Greater or equal | `"QUANTIDADE ge 100"` | | `lt` | Less than | `"VALOR lt 5000"` | | `le` | Less or equal | `"qtdChaves le 1000"` | ```{r comparison-examples} # Greater than get_pix_transaction_stats( database = "202509", filter = "VALOR gt 10000" ) # Less than or equal get_pix_keys( date = "2025-12-01", filter = "qtdChaves le 1000" ) ``` ### Logical Operators Combine multiple conditions with `and` and `or`: | Operator | Example | |----------|---------| | `and` | `"NATUREZA eq 'P2P' and PAG_REGIAO eq 'SUDESTE'"` | | `or` | `"Estado eq 'SP' or Estado eq 'RJ'"` | ```{r logical-examples} # AND - both conditions must be true get_pix_transaction_stats( database = "202509", filter = "NATUREZA eq 'P2P' and PAG_REGIAO eq 'SUDESTE'" ) # OR - either condition can be true get_pix_transactions_by_municipality( database = "202512", filter = "Estado eq 'SÃO PAULO' or Estado eq 'RIO DE JANEIRO'" ) # Complex filter with multiple conditions get_pix_transaction_stats( database = "202509", filter = "NATUREZA eq 'P2P' and PAG_REGIAO eq 'NORDESTE' and VALOR gt 1000" ) ``` ### String Functions The API supports string functions for text matching: | Function | Description | Example | |----------|-------------|---------| | `contains(field, value)` | Substring match | `"contains(Nome, 'BANCO')"` | | `startswith(field, value)` | Starts with | `"startswith(Nome, 'COOP')"` | | `endswith(field, value)` | Ends with | `"endswith(Nome, 'S.A.')"` | ```{r string-functions} # Find institutions containing "SICREDI" get_pix_keys( date = "2025-12-01", filter = "contains(Nome, 'SICREDI')" ) # Find cooperatives (starting with COOP) get_pix_keys( date = "2025-12-01", filter = "startswith(Nome, 'COOP')" ) ``` ## Ordering with $orderby Sort results by any column using the `orderby` parameter. ### Basic Ordering ```{r orderby-basic} # Ascending order (default) get_pix_keys( date = "2025-12-01", orderby = "qtdChaves" ) # Descending order (add "desc") get_pix_keys( date = "2025-12-01", orderby = "qtdChaves desc" ) # Order municipalities alphabetically get_pix_transactions_by_municipality( database = "202512", orderby = "Municipio" ) # Order by value descending get_pix_transaction_stats( database = "202509", orderby = "VALOR desc" ) ``` ### Combining Filter and OrderBy ```{r filter-orderby} # Filter by state and order by municipality name (descending) get_pix_transactions_by_municipality( database = "202512", filter = "Estado eq 'MARANHÃO'", orderby = "Municipio desc", top = 10 ) # Filter P2P transactions and order by value get_pix_transaction_stats( database = "202509", filter = "NATUREZA eq 'P2P'", orderby = "VALOR desc", top = 100 ) # Filter by key type and order by quantity get_pix_keys( date = "2025-12-01", filter = "TipoChave eq 'CPF' and NaturezaUsuario eq 'PF'", orderby = "qtdChaves desc", top = 50 ) ``` ## Selecting Columns with $select Retrieve only the columns you need using the `columns` parameter: ```{r select-example} # Select specific columns for PIX keys get_pix_keys( date = "2025-12-01", columns = c("Nome", "ISPB", "TipoChave", "qtdChaves"), top = 100 ) # Select specific columns for transactions get_pix_transactions_by_municipality( database = "202512", columns = c("Estado", "Municipio", "VL_PagadorPF", "QT_PagadorPF"), top = 100 ) ``` ## Pagination with $top and $skip Control the number of results returned: ```{r pagination-example} # Get first 10 records get_pix_keys(date = "2025-12-01", top = 10) # Skip first 100 records, get next 50 get_pix_keys(date = "2025-12-01", top = 50, skip = 100) ``` ### Iterating Through All Data For large datasets, use pagination to download in batches: ```{r pagination-loop} # Download all data in chunks all_data <- list() skip <- 0 batch_size <- 1000 repeat { batch <- get_pix_keys( date = "2025-12-01", top = batch_size, skip = skip, verbose = FALSE ) if (nrow(batch) == 0) break all_data <- c(all_data, list(batch)) skip <- skip + batch_size if (nrow(batch) < batch_size) break } # Combine all batches final_data <- dplyr::bind_rows(all_data) ``` ## Custom Queries with pix_query() For advanced use cases, use `pix_query()` with raw OData parameters: ```{r custom-query} # Custom query with all parameters pix_query( endpoint = "TransacoesPixPorMunicipio", params = list(DataBase = "202512"), filter = "Sigla_Regiao eq 'SE'", select = c("Estado", "Municipio", "VL_PagadorPF"), orderby = "VL_PagadorPF desc", top = 50 ) # Custom query for transaction stats pix_query( endpoint = "EstatisticasTransacoesPix", params = list(Database = "202509"), filter = "NATUREZA eq 'P2B' and FORMAINICIACAO eq 'QRDN'", orderby = "QUANTIDADE desc", top = 100 ) ``` ## Building and Debugging URLs Use `pix_url()` to see the URL that would be called: ```{r debug-url} # See the URL for a query pix_url( "TransacoesPixPorMunicipio", params = list(DataBase = "202512"), filter = "Estado eq 'MARANHÃO'", orderby = "Municipio desc", top = 10 ) # Returns: # https://olinda.bcb.gov.br/olinda/servico/Pix_DadosAbertos/versao/v1/odata/ # TransacoesPixPorMunicipio(DataBase=@DataBase)?$format=json&@DataBase='202512' # &$filter=Estado eq 'MARANHÃO'&$orderby=Municipio desc&$top=10 ``` ## Practical Examples ### Top 10 States by Transaction Volume ```{r example-states} library(dplyr) # Get all municipalities and aggregate by state get_pix_transactions_by_municipality(database = "202512") |> group_by(Estado) |> summarise( total_value = sum(VL_PagadorPF + VL_PagadorPJ), total_count = sum(QT_PagadorPF + QT_PagadorPJ), n_municipalities = n() ) |> arrange(desc(total_value)) |> head(10) ``` ### P2P vs P2B Transactions by Region ```{r example-nature} # Get P2P transactions by region p2p <- get_pix_transaction_stats( database = "202509", filter = "NATUREZA eq 'P2P'" ) |> group_by(PAG_REGIAO) |> summarise(p2p_value = sum(VALOR)) # Get P2B transactions by region p2b <- get_pix_transaction_stats( database = "202509", filter = "NATUREZA eq 'P2B'" ) |> group_by(PAG_REGIAO) |> summarise(p2b_value = sum(VALOR)) # Join and compare left_join(p2p, p2b, by = "PAG_REGIAO") ``` ### Market Share by Key Type ```{r example-keys} # Get keys by type get_pix_keys(date = "2025-12-01") |> group_by(TipoChave) |> summarise(total_keys = sum(qtdChaves)) |> mutate(share = total_keys / sum(total_keys) * 100) |> arrange(desc(share)) ``` ## Performance Tips 1. **Use filters server-side**: Let the API filter instead of downloading all data ```{r tip-filter} # Good: Filter on the server get_pix_transactions_by_municipality( database = "202512", filter = "Estado eq 'SÃO PAULO'" ) # Avoid: Downloading all data then filtering in R get_pix_transactions_by_municipality(database = "202512") |> dplyr::filter(Estado == "SÃO PAULO") ``` 2. **Select only needed columns**: Reduces response size ```{r tip-select} # Good: Request only what you need get_pix_keys( date = "2025-12-01", columns = c("Nome", "qtdChaves") ) ``` 3. **Use `top` during development**: Limit results when exploring ```{r tip-top} # Good: Explore with limited data get_pix_keys(date = "2025-12-01", top = 10) ``` 4. **Disable verbose mode in loops**: Reduces output clutter ```{r tip-verbose} get_pix_keys(date = "2025-12-01", verbose = FALSE) ``` ## Filter Reference by Endpoint ### ChavesPix (PIX Keys) | Column | Type | Filter Example | |--------|------|----------------| | Nome | string | `"Nome eq 'BANCO DO BRASIL S.A.'"` | | ISPB | string | `"ISPB eq '00000000'"` | | NaturezaUsuario | string | `"NaturezaUsuario eq 'PF'"` or `"NaturezaUsuario eq 'PJ'"` | | TipoChave | string | `"TipoChave eq 'CPF'"`, `"TipoChave eq 'Celular'"`, `"TipoChave eq 'e-mail'"`, `"TipoChave eq 'Aleatória'"`, `"TipoChave eq 'CNPJ'"` | | qtdChaves | numeric | `"qtdChaves gt 1000"` | ### TransacoesPixPorMunicipio (Transactions by Municipality) | Column | Type | Filter Example | |--------|------|----------------| | Estado | string | `"Estado eq 'SÃO PAULO'"` | | Municipio | string | `"Municipio eq 'RECIFE'"` | | Sigla_Regiao | string | `"Sigla_Regiao eq 'NE'"` (NE, SE, S, CO, N) | | Regiao | string | `"Regiao eq 'NORDESTE'"` | | VL_PagadorPF | numeric | `"VL_PagadorPF gt 1000000"` | ### EstatisticasTransacoesPix (Transaction Statistics) | Column | Type | Filter Example | |--------|------|----------------| | PAG_PFPJ | string | `"PAG_PFPJ eq 'PF'"` | | REC_PFPJ | string | `"REC_PFPJ eq 'PJ'"` | | PAG_REGIAO | string | `"PAG_REGIAO eq 'SUDESTE'"` | | REC_REGIAO | string | `"REC_REGIAO eq 'NORDESTE'"` | | NATUREZA | string | `"NATUREZA eq 'P2P'"` (P2P, P2B, B2P, B2B, P2G, G2P) | | FORMAINICIACAO | string | `"FORMAINICIACAO eq 'DICT'"` (DICT, QRDN, QRES, MANU, INIC) | | FINALIDADE | string | `"FINALIDADE eq 'Pix'"` | | VALOR | numeric | `"VALOR gt 10000"` | | QUANTIDADE | numeric | `"QUANTIDADE ge 100"` | ## See Also - [Getting Started](pixr.html) - Package introduction - [Understanding PIX Data](understanding-pix-data.html) - Data structure details - [Data Analysis Examples](analysis-examples.html) - Practical use cases ## References - [OData Protocol Documentation](https://www.odata.org/documentation/) - [BCB API Swagger](https://olinda.bcb.gov.br/olinda/servico/Pix_DadosAbertos/versao/v1/swagger-ui3#/) - [BCB PIX Data Portal](https://dadosabertos.bcb.gov.br/dataset/pix)