Retrieve Batched Data from BioMart with Filters and Attribute Chunks
biomaRt_getBM_batch.Rd
This function retrieves data from BioMart in batches by splitting attributes into smaller chunks. It fetches data using specified filters and joins the results sequentially, with optional filtering for protein-coding genes and standard chromosomes.
Arguments
- mart
A Mart object created with useMart, representing the BioMart dataset connection.
- attributes
A character vector of attributes to retrieve from BioMart.
- filters
A single attribute (string) used as a filter in the BioMart query, often the unique identifier.
- values
A vector of values to match with the specified filters attribute.
- chunk_size
An integer defining the number of attributes per batch request.
Value
A data frame containing the retrieved and filtered data, with each row representing a unique entity and columns corresponding to the requested attributes.
Details
This function processes the retrieval in chunks, making sequential calls to BioMart and avoiding large single queries.
The resulting data is filtered to include only protein-coding genes and standard chromosomes (1–22, X, Y).
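The filtering step might look roughly like the sketch below. The helper name keep_standard_protein_coding is hypothetical, and the use of the gene_biotype and chromosome_name columns is an assumption about how the filter is applied, not something this page states. The toy data frame is made up for illustration.

```r
# Hypothetical helper, not the package's actual source.
# Assumes the result has gene_biotype and chromosome_name columns.
keep_standard_protein_coding <- function(df) {
  standard_chr <- c(as.character(1:22), "X", "Y")
  # %in% treats NA as "no match", so rows with missing values are dropped.
  keep <- df$gene_biotype %in% "protein_coding" &
    df$chromosome_name %in% standard_chr
  df[keep, , drop = FALSE]
}

# Toy input with placeholder gene names.
toy <- data.frame(
  hgnc_symbol = c("GENE_A", "GENE_B", "GENE_C"),
  gene_biotype = c("protein_coding", "lncRNA", "protein_coding"),
  chromosome_name = c("17", "X", "MT"),
  stringsAsFactors = FALSE
)
filtered <- keep_standard_protein_coding(toy)
```

In this toy case only GENE_A survives: GENE_B is not protein-coding and GENE_C sits on a non-standard chromosome.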
If an attribute specified as filters is also in attributes, it will be removed from attributes to prevent duplication.
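A minimal sketch of the chunk-and-join logic described above, under stated assumptions: sketch_batch_fetch and mock_fetch are hypothetical names, and the fetch argument stands in for a real query such as biomaRt::getBM(attributes = attrs, filters = filters, values = values, mart = mart) so that the example runs without a network connection. The mock table holds toy values only.

```r
# Hypothetical sketch, not the package's actual source.
sketch_batch_fetch <- function(fetch, attributes, filters, values, chunk_size) {
  # Remove the filter attribute from the list so it is not requested twice.
  attributes <- setdiff(attributes, filters)

  # Split the remaining attributes into groups of at most chunk_size.
  chunks <- split(attributes, ceiling(seq_along(attributes) / chunk_size))

  result <- NULL
  for (chunk in chunks) {
    # Request the filter attribute with every chunk so it can act as join key.
    part <- fetch(c(filters, chunk), filters, values)
    result <- if (is.null(result)) {
      part
    } else {
      merge(result, part, by = filters, all = TRUE)
    }
  }
  result
}

# Offline stand-in for a BioMart query (toy values for illustration).
mock_table <- data.frame(
  hgnc_symbol = c("BRCA1", "TP53", "EGFR"),
  chromosome_name = c("17", "17", "7"),
  gene_biotype = rep("protein_coding", 3),
  strand = c(-1, -1, 1),
  stringsAsFactors = FALSE
)
mock_fetch <- function(attrs, filters, values) {
  mock_table[mock_table[[filters]] %in% values, attrs, drop = FALSE]
}

res <- sketch_batch_fetch(
  mock_fetch,
  attributes = c("hgnc_symbol", "chromosome_name", "gene_biotype", "strand"),
  filters = "hgnc_symbol",
  values = c("BRCA1", "EGFR"),
  chunk_size = 2
)
```

With chunk_size = 2, the three remaining attributes are fetched in two requests and merged on hgnc_symbol, so the key column appears once in the result.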
Examples
if (FALSE) { # \dontrun{
library(biomaRt)
mart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
attributes <- c("chromosome_name", "start_position", "end_position", "hgnc_symbol")
filters <- "hgnc_symbol"
values <- c("BRCA1", "TP53", "EGFR")
chunk_size <- 2
result <- biomaRt_getBM_batch(mart, attributes, filters, values, chunk_size)
} # }