regpyhdfe package
Submodules
regpyhdfe.regpyhdfe module
- class regpyhdfe.regpyhdfe.Regpyhdfe(df, target, predictors, absorb_ids=[], cluster_ids=[], drop_singletons=True, intercept=False)
Bases: object
- __init__(df, target, predictors, absorb_ids=[], cluster_ids=[], drop_singletons=True, intercept=False)
Regression wrapper for PyHDFE.
- Parameters:
df (pandas DataFrame) – dataframe containing the referenced data: the target, the predictors, and any absorb and cluster variables.
target (string) – name of target variable - the y in y = X*b + e.
predictors (string or list of strings) – names of predictors, the X in y = X*b + e.
absorb_ids (string or list of strings) – names of variables to be absorbed for fixed effects.
cluster_ids (string or list of strings) – names of variables to be clustered on.
drop_singletons (bool) – indicates whether to drop singleton groups. Default is True, the same as Stata's reghdfe. Setting it to False is equivalent to passing keepsingletons to reghdfe.
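To make the roles of absorb_ids and drop_singletons concrete, here is a minimal sketch of absorbing a single fixed effect by within-group demeaning, after dropping singleton groups. It does not call regpyhdfe or PyHDFE and is not necessarily how they work internally. The toy data and the column names y, x and firm are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): 'firm' plays the role of an absorb_ids variable.
# Firm "d" appears only once, so it is a singleton group.
df = pd.DataFrame({
    "y":    [1.0, 2.0, 3.0, 5.0, 4.0, 6.0, 9.0],
    "x":    [0.0, 1.0, 1.0, 2.0, 0.0, 1.0, 3.0],
    "firm": ["a", "a", "b", "b", "c", "c", "d"],
})

# Drop singleton groups, as described for drop_singletons=True.
group_size = df.groupby("firm")["firm"].transform("size")
kept = df[group_size > 1]

# Absorb the firm effect: subtract each firm's mean from y and x.
group_means = kept.groupby("firm")[["y", "x"]].transform("mean")
demeaned = kept[["y", "x"]] - group_means

# Least squares on the demeaned data estimates b in y = X*b + e net of firm effects.
beta, *_ = np.linalg.lstsq(
    demeaned[["x"]].to_numpy(), demeaned["y"].to_numpy(), rcond=None
)
print(len(kept), beta[0])
```

With one absorbed variable a single pass of demeaning is enough. With several absorbed variables the demeaning generally has to be iterated, which this sketch does not attempt.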
- regpyhdfe.regpyhdfe.summary(self, regpyhdfe, yname=None, xname=None, title=None, alpha=0.05)
Summarize the Regression Results.
- Parameters:
yname (str, optional) – Name of the endogenous (response) variable. Default is y.
xname (list[str], optional) – Names for the exogenous variables. Default is var_## for ## in the number of regressors. Must match the number of parameters in the model.
title (str, optional) – Title for the top table. If not None, then this replaces the default title.
alpha (float) – The significance level for the confidence intervals.
- Returns:
Instance holding the summary tables and text, which can be printed or converted to various output formats.
- Return type:
Summary
See also
statsmodels.iolib.summary.Summary
A class that holds summary results.
regpyhdfe.utils module
- regpyhdfe.utils.add_intercept(X)
Prepends a column of 1s (an intercept column) to a 2D numpy array.
- Parameters:
X (numpy array) – 2D numpy array.
- Returns:
X with an appended column of 1s.
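The description says the column of 1s is prepended, while the return note says it is appended. The sketch below builds both layouts with plain numpy so that the output of add_intercept can be compared against them. It does not call the library, and the array values are arbitrary.

```python
import numpy as np

X = np.array([[2.0, 3.0],
              [4.0, 5.0],
              [6.0, 7.0]])

# One intercept value per row, shaped as a column.
ones = np.ones((X.shape[0], 1))

X_prepended = np.hstack([ones, X])  # intercept is the first column
X_appended = np.hstack([X, ones])   # intercept is the last column

print(X_prepended)
print(X_appended)
```

Which layout is used matters only for matching coefficient names to columns; the fitted values are the same either way.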
- regpyhdfe.utils.get_np_columns(df, columns, intercept=False)
Helper used to retrieve columns as a numpy array.
- Parameters:
df (pandas DataFrame) – dataframe containing the desired columns.
columns (list of strings) – list of names of desired columns. Must be a list even if only 1 column is desired.
intercept (bool) – set to True if you'd like the resulting numpy array to have a column of 1s appended to it.
- Returns:
2D numpy array whose columns are the requested feature vectors, in the order given, i.e. the first column of the result is the first column named in the columns argument.
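The requirement that columns be a list follows from how pandas indexing works. The sketch below uses plain pandas selection, not get_np_columns itself, to show the column ordering and why a list keeps the result two-dimensional. The dataframe is a made-up example.

```python
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [7.0, 8.0, 9.0],
})

# Selecting with a list keeps the requested order: column 0 is "c", column 1 is "a".
arr = df[["c", "a"]].to_numpy()

# A one-element list still yields a 2D array; a bare string yields a 1D array.
one_col_2d = df[["a"]].to_numpy()
one_col_1d = df["a"].to_numpy()

print(arr.shape, one_col_2d.shape, one_col_1d.shape)
```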
- regpyhdfe.utils.sklearn_to_df(sklearn_dataset)
Converts (as well as it can) an sklearn dataset to a Pandas dataframe.
- Parameters:
sklearn_dataset (sklearn.utils.Bunch) – usually the object returned by one of sklearn's dataset loaders, e.g. sklearn.datasets.load_boston().
- Returns:
Pandas dataframe df where df['target'] is the target variable in the original dataset.
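As a rough picture of the conversion, the sketch below builds a dataframe from an object with the data, feature_names and target attributes that sklearn dataset loaders typically provide. The SimpleNamespace stand-in and its values are hypothetical, and the code is an assumption about the general shape of the result, not the library's implementation.

```python
from types import SimpleNamespace

import numpy as np
import pandas as pd

# Stand-in for an sklearn Bunch (hypothetical values).
bunch = SimpleNamespace(
    data=np.array([[5.1, 3.5], [4.9, 3.0], [6.2, 2.9]]),
    feature_names=["sepal_length", "sepal_width"],
    target=np.array([0, 0, 1]),
)

# One column per feature, plus a 'target' column holding the labels.
df = pd.DataFrame(bunch.data, columns=bunch.feature_names)
df["target"] = bunch.target

print(df)
```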