Class DataFrame

java.lang.Object
com.leumanuel.woozydata.model.DataFrame

public class DataFrame extends Object
Core class for data manipulation and analysis. Provides a fluent interface for common data operations.
Version:
1.0
Author:
Leu A. Manuel
  • Constructor Details

    • DataFrame

      public DataFrame(List<Map<String,Object>> data)
      Creates a new DataFrame with the given data.
      Parameters:
      data - List of maps representing tabular data
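
The constructor takes one map per row, keyed by column name. The following is a minimal sketch of building such a list with standard collections only; the class name `RowsExample`, its helper methods, and the column names are hypothetical and chosen for illustration. The constructor call appears as a comment because it needs the library on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RowsExample {
    // Builds one row; LinkedHashMap keeps column order and permits null values.
    static Map<String, Object> row(String name, Integer age) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("name", name);
        r.put("age", age);
        return r;
    }

    // Two sample rows; the second has a missing age stored as null.
    static List<Map<String, Object>> sampleRows() {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("Ana", 31));
        rows.add(row("Ben", null));
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = sampleRows();
        // With the library available, this list is the constructor argument:
        // DataFrame df = new DataFrame(rows);
        System.out.println(rows);
    }
}
```

Mutable maps are used here because `Map.of` rejects null values, and tabular data often contains them.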
  • Method Details

    • clean

      public DataFrame clean()
      Performs automatic data cleaning: removes rows containing null values, removes duplicate rows, and fixes data types.
      Returns:
      this DataFrame for method chaining
    • dropNull

      public DataFrame dropNull()
      Removes rows containing null values.
      Returns:
      this DataFrame for method chaining
    • dropDupes

      public DataFrame dropDupes()
      Removes duplicate rows from the DataFrame.
      Returns:
      this DataFrame for method chaining
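
The row-level behavior described for dropNull and dropDupes can be sketched with plain collections. This is not the library's implementation: the class `RowFilters` and its methods are hypothetical, and two points are assumptions, namely that a row is dropped when any of its columns is null, and that the first occurrence of a duplicated row is the one kept.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class RowFilters {
    // Keeps rows in which no column value is null (assumed meaning of dropNull).
    static List<Map<String, Object>> withoutNullRows(List<Map<String, Object>> rows) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (row.values().stream().noneMatch(Objects::isNull)) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Keeps the first occurrence of each row; Map.equals compares all entries.
    static List<Map<String, Object>> withoutDuplicates(List<Map<String, Object>> rows) {
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }
}
```

Unlike the documented methods, which return the same DataFrame for chaining, these helpers return new lists and leave the input untouched.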
    • fixTypes

      public DataFrame fixTypes()
      Automatically converts data types based on content.
      Returns:
      this DataFrame for method chaining
    • fill

      public DataFrame fill(Object value)
      Fills null values with a specified value.
      Parameters:
      value - Value to replace nulls with
      Returns:
      this DataFrame for method chaining
    • fillNa

      public DataFrame fillNa(Object value, String... columns)
      Fills null values in the specified columns with the given value.
      Parameters:
      value - Value to replace nulls with
      columns - Columns to fill
      Returns:
      this DataFrame for method chaining
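
As an illustration of column-restricted filling, here is a small stand-in written against plain lists and maps. `FillSketch` is a hypothetical name, and the handling of columns that are absent from a row (left absent) is an assumption, since the documentation does not say what the library does in that case.

```java
import java.util.List;
import java.util.Map;

final class FillSketch {
    // Replaces null values in the named columns, modifying the rows in place.
    static List<Map<String, Object>> fillNa(List<Map<String, Object>> rows,
                                            Object value, String... columns) {
        for (Map<String, Object> row : rows) {
            for (String column : columns) {
                // Assumption: only explicit nulls are filled; absent keys stay absent.
                if (row.containsKey(column) && row.get(column) == null) {
                    row.put(column, value);
                }
            }
        }
        return rows; // same list, mirroring the chaining style of the API
    }
}
```

Calling `FillSketch.fillNa(rows, 0, "price")` would leave nulls in every other column as they are, which is the difference from filling across the whole table.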
    • rank

      public DataFrame rank(String... columns)
      Ranks (sorts) the DataFrame by the specified columns. Rows are sorted in descending order, with null values treated as the lowest values. For large datasets (more than 1000 rows), parallel processing is used for better performance.
      Parameters:
      columns - Columns to use for ranking, in order of priority
      Returns:
      this DataFrame for method chaining
      Throws:
      IllegalArgumentException - if no columns are specified
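
The ordering rules above (descending, nulls lowest, columns in priority order) can be expressed with a chained `Comparator`. The sketch below is a sequential stand-in, not the library's code: `RankSketch` is a hypothetical name, cell values are assumed to be mutually comparable, the parallel path for large inputs is omitted, and the result is a sorted copy rather than an in-place change.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

final class RankSketch {
    // Compares two cell values ascending, treating null as the lowest value.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compareCells(Object a, Object b) {
        if (a == null || b == null) {
            return a == b ? 0 : (a == null ? -1 : 1);
        }
        return ((Comparable) a).compareTo(b); // assumes mutually comparable values
    }

    static List<Map<String, Object>> rank(List<Map<String, Object>> rows, String... columns) {
        if (columns.length == 0) {
            throw new IllegalArgumentException("at least one column is required");
        }
        // Chain one comparator per column, earlier columns taking priority.
        Comparator<Map<String, Object>> ascending = null;
        for (String column : columns) {
            Comparator<Map<String, Object>> next =
                    (x, y) -> compareCells(x.get(column), y.get(column));
            ascending = (ascending == null) ? next : ascending.thenComparing(next);
        }
        List<Map<String, Object>> sorted = new ArrayList<>(rows);
        sorted.sort(ascending.reversed()); // descending, so nulls end up last
        return sorted;
    }
}
```

Reversing the whole chain flips every column to descending at once, which keeps the null handling consistent: a value that is lowest in ascending order appears last after the reversal.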
    • select

      public DataFrame select(String... columns)
      Creates a new DataFrame containing only the specified columns, preserving the original row order. Specified columns that do not exist are ignored.
      Parameters:
      columns - Names of columns to select
      Returns:
      new DataFrame containing only the selected columns
      Throws:
      IllegalArgumentException - if no columns are specified
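
A column projection with these rules might look like the following sketch over plain rows. `SelectSketch` is a hypothetical name, and emitting columns in the order they were requested is an assumption; the documentation only states that row order is kept and unknown columns are ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SelectSketch {
    static List<Map<String, Object>> select(List<Map<String, Object>> rows, String... columns) {
        if (columns.length == 0) {
            throw new IllegalArgumentException("at least one column is required");
        }
        List<Map<String, Object>> result = new ArrayList<>(rows.size());
        for (Map<String, Object> row : rows) {
            // New map per row, so the source rows are never modified.
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String column : columns) {
                if (row.containsKey(column)) { // unknown columns are skipped silently
                    projected.put(column, row.get(column));
                }
            }
            result.add(projected);
        }
        return result;
    }
}
```

Because each projected row is a fresh map, changes made to the result do not reach the source rows, which matches the documented contract of returning a new DataFrame.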
    • show

      public void show(int limit)
      Displays the first limit rows of the DataFrame. Useful for quickly inspecting the data content.
      Parameters:
      limit - Number of rows to display
      Throws:
      IllegalArgumentException - if limit is negative
    • getData

      public List<Map<String,Object>> getData()
      Returns the underlying data structure of the DataFrame. Each element in the list represents a row, with column names mapped to values. The returned list is a direct reference to the DataFrame's data.

      Note: Modifying the returned list, or any row map in it, changes the DataFrame's content. To work on the data safely, copy it before modifying it.

      Returns:
      List of Maps containing the DataFrame's data
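
Since getData returns a direct reference, one way to get an independent working copy is to duplicate both the list and each row map. The helper below is a sketch with a hypothetical name, `CopySketch.copyRows`; it copies one level deep, so mutable cell values (for example a nested list) would still be shared.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CopySketch {
    // Copies the list and each row map; cell values themselves are shared.
    static List<Map<String, Object>> copyRows(List<Map<String, Object>> rows) {
        List<Map<String, Object>> copy = new ArrayList<>(rows.size());
        for (Map<String, Object> row : rows) {
            copy.add(new LinkedHashMap<>(row));
        }
        return copy;
    }
}
```

With the library available, `CopySketch.copyRows(df.getData())` would yield rows that can be edited or removed without affecting `df`. Copying only the outer list with `new ArrayList<>(rows)` would not be enough, because the row maps would still be the DataFrame's own.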