Trawl a local CRAN archive and extract statistics from all packages

Description

Trawl a local CRAN archive and extract statistics from all packages

Usage

pkgstats_from_archive(
  path,
  archive = TRUE,
  prev_results = NULL,
  results_file = NULL,
  chunk_size = 1000L,
  num_cores = 1L,
  save_full = FALSE,
  save_ex_calls = FALSE,
  results_path = fs::path_temp()
)

Arguments

  • path: Path to local archive of R packages, either as source directories, or ‘.tar.gz’ files such as in a CRAN mirror.

  • archive: If TRUE, extract statistics for all packages in the /Archive sub-directory; otherwise extract statistics only for packages in the main directory (that is, current packages only).

  • prev_results: Result of a previous call to this function, if available. Submitting previous results ensures that only newer packages not present in the previous result are analysed, with new results appended to the previous results. This parameter can also specify a file to be read with readRDS().

  • results_file: Can be used to specify the name or full path of a .Rds file to which results should be saved once they have been generated. The ‘.Rds’ extension will be automatically appended, and any other extensions will be ignored.

  • chunk_size: Divide a large archive trawl into chunks of this size, and save intermediate results to local files. These intermediate files can be combined to generate a single prev_results file, so that jobs can be stopped and re-started without having to recalculate all results. The files are named pkgstats-results-N.Rds, where “N” incrementally numbers each file.

  • num_cores: Number of machine cores to use in parallel, defaulting to single-core processing.

  • save_full: If TRUE, full pkgstats results are saved for each package to files in results_path.

  • save_ex_calls: If TRUE, the results of the external_calls component are saved for each package to files in results_path. This applies only if save_full = FALSE.

  • results_path: Path to save intermediate files generated by the chunk_size parameter described above.
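The chunk_size description above notes that the intermediate files can be combined into a single prev_results file. A minimal sketch of one way to do that in base R follows. It assumes each intermediate file holds a data.frame with the same columns; the two small data frames, their column names, and the file name prev-results.Rds are hypothetical stand-ins rather than real pkgstats output.

```r
# Hypothetical stand-ins for the intermediate chunk files. In real use these
# would already have been written to `results_path` by the archive trawl.
results_path <- file.path (tempdir (), "pkgstats-chunks")
dir.create (results_path, showWarnings = FALSE)
chunk1 <- data.frame (package = c ("aaa", "bbb"), version = c ("0.1", "1.0"))
chunk2 <- data.frame (package = "ccc", version = "2.3")
saveRDS (chunk1, file.path (results_path, "pkgstats-results-1.Rds"))
saveRDS (chunk2, file.path (results_path, "pkgstats-results-2.Rds"))

# Find the chunk files using the documented naming pattern:
chunk_files <- list.files (
    results_path,
    pattern = "^pkgstats-results-[0-9]+\\.Rds$",
    full.names = TRUE
)

# Read each chunk and row-bind them into a single data.frame
# (assumes all chunks share the same columns):
combined <- do.call (rbind, lapply (chunk_files, readRDS))

# Save the combined results to a single file (hypothetical file name):
prev_file <- file.path (results_path, "prev-results.Rds")
saveRDS (combined, prev_file)
```

Because prev_results accepts either a previous result or a file to be read with readRDS(), either `combined` or `prev_file` from this sketch could be supplied as prev_results when re-starting a trawl.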

See also

Other archive: [dl_pkgstats_data](dl_pkgstats_data)(), [pkgstats_cran_current_from_full](pkgstats_cran_current_from_full)(), [pkgstats_fns_from_archive](pkgstats_fns_from_archive)(), [pkgstats_fns_update](pkgstats_fns_update)(), [pkgstats_update](pkgstats_update)()

Concept

archive

Value

A data.frame object with one row for each package containing summary statistics generated from the pkgstats_summary function.

Examples

# Create fake archive directory with single tarball:
f <- system.file ("extdata", "pkgstats_9.9.tar.gz", package = "pkgstats")
tarball <- basename (f)

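# Create the top-level archive directory in the session's temporary directory:
archive_path <- file.path (tempdir (), "archive")
if (!dir.exists (archive_path)) {
    dir.create (archive_path)
}
# Copy the example tarball into the archive directory:
path <- file.path (archive_path, tarball)
file.copy (f, path)
# Place a second copy in a "tarballs" sub-directory, then trawl that directory:
tarball_path <- file.path (archive_path, "tarballs")
dir.create (tarball_path, recursive = TRUE)
file.copy (path, file.path (tarball_path, tarball))
out <- pkgstats_from_archive (tarball_path)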