curvefit
- UnitsAwareDataArray.curvefit(coords: str | DataArray | Iterable[str | DataArray], func: Callable[..., Any], reduce_dims: Dims = None, skipna: bool = True, p0: Mapping[str, float | DataArray] | None = None, bounds: Mapping[str, tuple[float | DataArray, float | DataArray]] | None = None, param_names: Sequence[str] | None = None, errors: ErrorOptions = 'raise', kwargs: dict[str, Any] | None = None) Dataset
Curve fitting optimization for arbitrary functions.
Wraps scipy.optimize.curve_fit with apply_ufunc.
- Parameters:
coords (Hashable, DataArray, or sequence of DataArray or Hashable) – Independent coordinate(s) over which to perform the curve fitting. Must share at least one dimension with the calling object. When fitting multi-dimensional functions, supply coords as a sequence in the same order as arguments in func. To fit along existing dimensions of the calling object, coords can also be specified as a str or sequence of strs.
func (callable) – User specified function in the form f(x, *params) which returns a numpy array of length len(x). params are the fittable parameters which are optimized by scipy curve_fit. x can also be specified as a sequence containing multiple coordinates, e.g. f((x0, x1), *params).
reduce_dims (str, Iterable of Hashable or None, optional) – Additional dimension(s) over which to aggregate while fitting. For example, calling ds.curvefit(coords=’time’, reduce_dims=[‘lat’, ‘lon’], …) will aggregate all lat and lon points and fit the specified function along the time dimension.
skipna (bool, default: True) – Whether to skip missing values when fitting. Default is True.
p0 (dict-like or None, optional) – Optional dictionary of parameter names to initial guesses passed to the curve_fit p0 arg. If the values are DataArrays, they will be appropriately broadcast to the coordinates of the array. If none or only some parameters are passed, the rest will be assigned initial values following the default scipy behavior.
bounds (dict-like, optional) – Optional dictionary of parameter names to tuples of bounding values passed to the curve_fit bounds arg. If any of the bounds are DataArrays, they will be appropriately broadcast to the coordinates of the array. If none or only some parameters are passed, the rest will be unbounded following the default scipy behavior.
param_names (sequence of Hashable or None, optional) – Sequence of names for the fittable parameters of func. If not supplied, this will be automatically determined by arguments of func. param_names should be manually supplied when fitting a function that takes a variable number of parameters.
errors ({"raise", "ignore"}, default: "raise") – If ‘raise’, any errors from the scipy.optimize_curve_fit optimization will raise an exception. If ‘ignore’, the coefficients and covariances for the coordinates where the fitting failed will be NaN.
**kwargs (optional) – Additional keyword arguments to passed to scipy curve_fit.
- Returns:
curvefit_results – A single dataset which contains:
- [var]_curvefit_coefficients
The coefficients of the best fit.
- [var]_curvefit_covariance
The covariance matrix of the coefficient estimates.
- Return type:
Examples
Generate some exponentially decaying data, where the decay constant and amplitude are different for different values of the coordinate
x
:>>> rng = np.random.default_rng(seed=0) >>> def exp_decay(t, time_constant, amplitude): ... return np.exp(-t / time_constant) * amplitude ... >>> t = np.arange(11) >>> da = xr.DataArray( ... np.stack( ... [ ... exp_decay(t, 1, 0.1), ... exp_decay(t, 2, 0.2), ... exp_decay(t, 3, 0.3), ... ] ... ) ... + rng.normal(size=(3, t.size)) * 0.01, ... coords={"x": [0, 1, 2], "time": t}, ... ) >>> da <xarray.DataArray (x: 3, time: 11)> Size: 264B array([[ 0.1012573 , 0.0354669 , 0.01993775, 0.00602771, -0.00352513, 0.00428975, 0.01328788, 0.009562 , -0.00700381, -0.01264187, -0.0062282 ], [ 0.20041326, 0.09805582, 0.07138797, 0.03216692, 0.01974438, 0.01097441, 0.00679441, 0.01015578, 0.01408826, 0.00093645, 0.01501222], [ 0.29334805, 0.21847449, 0.16305984, 0.11130396, 0.07164415, 0.04744543, 0.03602333, 0.03129354, 0.01074885, 0.01284436, 0.00910995]]) Coordinates: * x (x) int64 24B 0 1 2 * time (time) int64 88B 0 1 2 3 4 5 6 7 8 9 10
Fit the exponential decay function to the data along the
time
dimension:>>> fit_result = da.curvefit("time", exp_decay) >>> fit_result["curvefit_coefficients"].sel( ... param="time_constant" ... ) <xarray.DataArray 'curvefit_coefficients' (x: 3)> Size: 24B array([1.05692036, 1.73549638, 2.94215771]) Coordinates: * x (x) int64 24B 0 1 2 param <U13 52B 'time_constant' >>> fit_result["curvefit_coefficients"].sel(param="amplitude") <xarray.DataArray 'curvefit_coefficients' (x: 3)> Size: 24B array([0.1005489 , 0.19631423, 0.30003579]) Coordinates: * x (x) int64 24B 0 1 2 param <U13 52B 'amplitude'
An initial guess can also be given with the
p0
arg (although it does not make much of a difference in this simple example). To have a different guess for different coordinate points, the guess can be a DataArray. Here we use the same initial guess for the amplitude but different guesses for the time constant:>>> fit_result = da.curvefit( ... "time", ... exp_decay, ... p0={ ... "amplitude": 0.2, ... "time_constant": xr.DataArray([1, 2, 3], coords=[da.x]), ... }, ... ) >>> fit_result["curvefit_coefficients"].sel(param="time_constant") <xarray.DataArray 'curvefit_coefficients' (x: 3)> Size: 24B array([1.0569213 , 1.73550052, 2.94215733]) Coordinates: * x (x) int64 24B 0 1 2 param <U13 52B 'time_constant' >>> fit_result["curvefit_coefficients"].sel(param="amplitude") <xarray.DataArray 'curvefit_coefficients' (x: 3)> Size: 24B array([0.10054889, 0.1963141 , 0.3000358 ]) Coordinates: * x (x) int64 24B 0 1 2 param <U13 52B 'amplitude'
See also
DataArray.polyfit
,scipy.optimize.curve_fit