hgc.samples_frame module

The SamplesFrame class is an extended Pandas DataFrame, offering additional methods for validation of hydrochemical data, calculation of relevant ratios and classifications.

class hgc.samples_frame.SamplesFrame(pandas_obj)

Bases: object

DataFrame with additional hydrochemistry-specific methods. All HGC methods and attributes defined in this class are available in the namespace ‘hgc’ of the DataFrame.

Examples

To use HGC methods, we always start from a Pandas DataFrame:

import pandas as pd
import hgc

# We start off with an ordinary DataFrame
df = pd.DataFrame({'Cl': [1,2,3], 'Mg': [11,12,13]})

# Since we imported hgc, the HGC methods become available
# on the DataFrame. This allows us, for instance, to use HGC's
# validation function
df.hgc.is_valid
False
df.hgc.make_valid()

allowed_hgc_columns

Returns allowed columns of the hgc SamplesFrame

consolidate(use_ph='field', use_ec='lab', use_so4='ic', use_o2='field', use_temp='field', use_alkalinity='alkalinity', merge_on_na=False, inplace=True)

Consolidate parameters measured with different methods to one single parameter.

Parameters such as EC and pH are frequently measured both in the lab and in the field, and SO4 and PO4 are frequently measured both by IC and by ICP-OES. Normally we prefer the field data for EC and pH, but poorly calibrated sensors or tough field conditions may make the field readings less reliable than the lab measurements. This method allows quick selection of the preferred measurement method for each parameter, which is then used for further analysis.

For each consolidated parameter HGC adds a new column that is filled with either the lab measurements or the field measurements. It is also possible to fill it with the preferred method and fill any remaining NaNs with measurements from the other method.

Parameters:
  • use_ph ({'lab', 'field', None}, default 'field') – Which pH to use? Ignored if None.
  • use_ec ({'lab', 'field', None}, default 'lab') – Which EC to use?
  • use_so4 ({'ic', 'field', None}, default 'ic') – Which SO4 to use?
  • use_o2 ({'lab', 'field', None}, default 'field') – Which O2 to use?
  • use_alkalinity (str, default 'alkalinity') – Name of the column to use for alkalinity.
  • merge_on_na (bool, default False) – Fill NaNs from one measurement method with measurements from the other method.
  • inplace (bool, default True) – Modify SamplesFrame in place. inplace=False is not implemented (yet)
Raises:

ValueError: if one of the `use_` parameters is set to a column that is not in the dataframe, or if one of the default parameters is not in the dataframe while it is not set to None.
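The fill behaviour controlled by merge_on_na can be sketched in plain Python. This only illustrates the idea and is not HGC's implementation: the function name consolidate_values and the list-based inputs are hypothetical stand-ins for the DataFrame columns.

```python
import math


def consolidate_values(preferred, other, merge_on_na=False):
    """Combine two measurement series of equal length into one.

    `preferred` and `other` are hypothetical stand-ins for, e.g., the
    field and lab pH columns. Missing values are represented as NaN.
    """
    if len(preferred) != len(other):
        raise ValueError("both series must have the same length")
    if not merge_on_na:
        # Use the preferred method only; gaps stay NaN.
        return list(preferred)
    # Use the preferred value where present, otherwise fall back
    # on the value measured with the other method.
    return [o if math.isnan(p) else p for p, o in zip(preferred, other)]


ph_field = [7.1, math.nan, 6.8]
ph_lab = [7.3, 7.0, math.nan]

ph_strict = consolidate_values(ph_field, ph_lab)
ph_merged = consolidate_values(ph_field, ph_lab, merge_on_na=True)
```

With merge_on_na=False the gap in the field series is kept; with merge_on_na=True it is filled from the lab series.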

fillna_concentrations(how='phreeqc')

Calculate missing concentrations based on the charge balance.

Parameters:
  • how ({'phreeqc', 'analytic'}, default 'phreeqc') – Method to compute missing concentrations.

fillna_ec(use_phreeqc=True)

Calculate missing Electrical Conductivity measurements using known anions and cations.

get_bex(watertype='G', inplace=True)

Get Base Exchange Index (meq/L). By default this is the BEX without dolomite.

Parameters:
  • watertype ({'G', 'P'}, default 'G') – Watertype (Groundwater or Precipitation)
  • inplace (bool, optional, default True) – whether the base exchange index should be added to the pd.DataFrame (inplace=True) or returned as a pd.Series (inplace=False).
Returns:

Returns None if inplace=True or pd.Series with base exchange index for each row in SamplesFrame if inplace=False.

Return type:

pandas.Series or None
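As a rough illustration of what a base exchange index expresses, the sketch below assumes the formulation BEX = Na + K + Mg - 1.0716 * Cl (all in meq/L), which is commonly attributed to Stuyfzand for the variant without dolomite. The factor and the helper name bex_without_dolomite are assumptions made for this example, not taken from the HGC source; check them against the HGC manual before relying on them.

```python
def bex_without_dolomite(na, k, mg, cl):
    """Base exchange index in meq/L.

    Assumes BEX = Na + K + Mg - 1.0716 * Cl with all inputs in meq/L.
    The factor 1.0716 is an assumption of this sketch.
    """
    return na + k + mg - 1.0716 * cl


# Hypothetical sample concentrations in meq/L.
bex = bex_without_dolomite(na=5.0, k=0.2, mg=1.0, cl=4.0)
```

Under this formulation a positive value is usually read as a sign of freshening and a negative value as a sign of salinization.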

get_dominant_anions(inplace=True)

Calculates the dominant anion of each row in the SamplesFrame as used by the Stuyfzand water type classification. (See http://www.hydrology-amsterdam.nl/valorisation/HGCmanual_v2_1.pdf chapter 5 for the definitions.)

Parameters:
  • inplace (bool, optional, default True) – whether the dominant anion should be added to the pd.DataFrame as column dominant_anion (inplace=True) or returned as a pd.Series (inplace=False).
Returns:

Returns None if inplace=True or pd.Series with dominant anion for each row in SamplesFrame if inplace=False.

Return type:

pandas.Series or None

get_dominant_cations(*args, **kwargs)

get_ion_balance(inplace=True)

Calculate the balance between anions and cations and add it as a percentage [%] to the SamplesFrame in the column ‘ion_balance’.

Parameters:
  • inplace (bool, optional, default True) – whether the ion balance should be added to the SamplesFrame (inplace=True) as column ion_balance or returned as a pd.Series (inplace=False).
Returns:

Returns None if inplace=True or pd.Series with ion balance for each row in SamplesFrame if inplace=False.

Return type:

pandas.Series or None

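To make the idea of an ion balance concrete, the sketch below computes one in plain Python. It assumes the common convention 100 * (cations - anions) / (cations + anions) with sums in meq/L; HGC may use a different sign convention, denominator or set of ions. The names IONS, to_meq and ion_balance_percent are hypothetical.

```python
# Molar masses (g/mol) and absolute charges of a few major ions.
IONS = {
    "Na": (22.99, 1),
    "K": (39.10, 1),
    "Ca": (40.08, 2),
    "Mg": (24.305, 2),
    "Cl": (35.45, 1),
    "HCO3": (61.02, 1),
    "SO4": (96.06, 2),
}
CATIONS = ("Na", "K", "Ca", "Mg")
ANIONS = ("Cl", "HCO3", "SO4")


def to_meq(ion, mg_per_l):
    """Convert a concentration in mg/L to meq/L."""
    molar_mass, charge = IONS[ion]
    return mg_per_l / molar_mass * charge


def ion_balance_percent(sample):
    """Ion balance in percent for a dict of concentrations in mg/L.

    Assumes 100 * (cations - anions) / (cations + anions); ions that
    are missing from the sample count as zero.
    """
    cations = sum(to_meq(ion, sample.get(ion, 0.0)) for ion in CATIONS)
    anions = sum(to_meq(ion, sample.get(ion, 0.0)) for ion in ANIONS)
    return 100.0 * (cations - anions) / (cations + anions)


# A pure NaCl solution (1 mmol/L of each ion) is balanced by construction.
balance = ion_balance_percent({"Na": 22.99, "Cl": 35.45})

# Doubling the sodium gives a cation surplus.
surplus = ion_balance_percent({"Na": 45.98, "Cl": 35.45})
```

A balance far from zero usually points at a missing major ion or an analytical error in the sample.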
get_partial_pressure(gas, use_phreeqc=True, inplace=True, **kwargs)

Adds or returns the partial pressure of a gas using phreeqc. It is an alias for get_saturation_index, so see that method for details. The result column is named pp_<gas_name>.

get_phreeqpython_solutions(equilibrate_with='none', inplace=True)

Return a series of phreeqpython solutions derived from the (row)data in the SamplesFrame.

Parameters:
  • equilibrate_with (str, default 'none') – Ion to add for achieving charge equilibrium in the solutions.
  • inplace (bool, default True) – Whether the result is added to the pd.DataFrame as column pp_solutions (inplace=True) or returned as a pd.Series (inplace=False).
Returns:

Returns None if inplace=True or pd.Series with PhreeqPython.Solution instances for every row in SamplesFrame if inplace=False.

Return type:

pandas.Series or None

get_ratios(*args, **kwargs)

get_saturation_index(mineral_or_gas, use_phreeqc=True, inplace=True, **kwargs)

Adds or returns the saturation index (SI) of a mineral or the partial pressure of a gas using phreeqc. The column name of the result is si_<mineral_name> in lower case (if inplace=True).

Parameters:
  • mineral_or_gas (str) – the name of the mineral for which the SI needs to be calculated
  • use_phreeqc (bool) – whether to use phreeqc as backend or fall back on internal hgc-routines to calculate SI or partial pressure
  • inplace (bool, optional, default=True) – whether the saturation index should be added to the pd.DataFrame (inplace=True) as column si_<mineral_name> or returned as a pd.Series (inplace=False).
Returns:

Returns None if inplace=True or pd.Series with the saturation index of the mineral for each row in SamplesFrame if inplace=False.

Return type:

pandas.Series or None

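For readers unfamiliar with the quantity, the sketch below shows the textbook definition SI = log10(IAP / K) for calcite, together with the lower-case column naming described above. It ignores speciation and activity corrections, so it is not what phreeqc computes. The helper names and the example activities are hypothetical, and the log K of -8.48 is the commonly tabulated value for calcite at 25 °C.

```python
import math

# log10 of the calcite solubility product at 25 degrees C
# (commonly tabulated value; an assumption of this sketch).
LOG_K_CALCITE = -8.48


def saturation_index(ion_activity_product, log_k):
    """SI = log10(IAP / K); zero means equilibrium with the mineral."""
    return math.log10(ion_activity_product) - log_k


def result_column_name(mineral):
    """Column name si_<mineral_name> in lower case."""
    return "si_" + mineral.lower()


# Hypothetical activities of Ca and CO3 (dimensionless).
a_ca, a_co3 = 1.0e-3, 1.0e-5
si_calcite = saturation_index(a_ca * a_co3, LOG_K_CALCITE)
column = result_column_name("Calcite")
```

A positive SI indicates oversaturation with respect to the mineral, a negative SI undersaturation.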
get_specific_conductance(use_phreeqc=True, inplace=True, **kwargs)

Returns the specific conductance (sc) of a water sample using phreeqc. Specific conductance is also known as electrical conductivity (ec) or egv.

Parameters:
  • use_phreeqc (bool, optional) – whether to use phreeqc as backend or fall back on internal hgc-routines to calculate the specific conductance
  • inplace (bool, optional, default=True) – whether the specific conductance should be added to the pd.DataFrame (inplace=True) as column sc or returned as a pd.Series (inplace=False).
  • **kwargs – passed on to the method get_phreeqpython_solutions
Returns:

Returns None if inplace=True or pd.Series with specific conductance for each row in SamplesFrame if inplace=False.

Return type:

pandas.Series or None

get_stuyfzand_water_type(*args, **kwargs)

get_sum_anions(*args, **kwargs)

get_sum_cations(*args, **kwargs)

hgc_cols

Return the columns that are used by hgc

is_valid

Returns a boolean indicating whether the columns used by hgc have valid values.

make_valid()

Try to convert the DataFrame into a valid HGC-SamplesFrame.

select_phreeq_columns(*args, **kwargs)

hgc.samples_frame.requires_ph(func)

Decorator function for methods in the SamplesFrame class that require a column ph with valid values (non-zero and non-NaN).
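The pattern behind such a decorator can be sketched as follows. This is a hypothetical re-creation and not HGC's code: a plain dict of lists stands in for the SamplesFrame, and raising ValueError on an invalid or missing ph column is an assumption of this example.

```python
import functools
import math


def requires_ph(func):
    """Hypothetical sketch: refuse to run `func` without valid pH values."""

    @functools.wraps(func)
    def wrapper(samples, *args, **kwargs):
        ph = samples.get("ph")
        if ph is None:
            raise ValueError("column 'ph' is required")
        # Valid means non-zero and non-NaN for every row.
        if any(value == 0 or math.isnan(value) for value in ph):
            raise ValueError("column 'ph' contains zero or NaN values")
        return func(samples, *args, **kwargs)

    return wrapper


@requires_ph
def mean_ph(samples):
    """Toy method that only makes sense when every pH value is valid."""
    return sum(samples["ph"]) / len(samples["ph"])


result = mean_ph({"ph": [7.0, 8.0]})
```

Wrapping the check in a decorator keeps the validation in one place, so every method that depends on pH fails in the same way.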