• Stars: 180
• Rank: 213,097 (Top 5 %)
• Language: R
• Created almost 7 years ago
• Updated over 2 years ago

Repository Details

polyreg, an Alternative to Machine Learning Methods

A package to automate the formation and evaluation of multivariate polynomial regression models, especially as an alternative to neural networks and other machine learning algorithms.

An important feature is that dummy variables are handled properly, so that, for instance, no powers of a dummy variable are generated, since they would merely duplicate the original.
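The duplication issue can be seen in a few lines of base R. This is only a minimal sketch with made-up variable names (d, y), not polyreg's internal code: it shows that powers of a 0/1 variable reproduce the variable, and that including such a power leaves lm() with a redundant column.

```r
# Hypothetical 0/1 dummy variable (e.g. an indicator for one category)
d <- c(0, 1, 1, 0, 1)

# Any positive power of a 0/1 variable equals the variable itself,
# so d^2 and d^3 would add no information as regressors.
identical(d^2, d)
identical(d^3, d)

# Including the duplicate makes the design matrix rank-deficient:
# lm() reports NA for the redundant coefficient.
set.seed(1)
y <- 2 + 3 * d + rnorm(length(d))
fit_dup <- lm(y ~ d + I(d^2))
coef(fit_dup)
```

A package that forms polynomial terms automatically has to skip such columns, which is the kind of bookkeeping described above.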

Note: This library is used in the qeML package; qeML ("quick and easy machine learning") provides a convenient, consistent interface to various machine learning algorithms, including polynomial regression via polyreg. There is also a polynomial version of ridge regression. Except for special purposes, it is recommended that the user try the qeML interface, rather than using polyreg directly.

Motivation

In Polynomial Regression As an Alternative to Neural Nets, by Cheng, Khomtchouk, Matloff and Mohanty, 2018, it is argued that dense, feedforward neural networks are essentially polynomial regression models. This was extended in Towards a Mathematical Framework to Inform Neural Network Modelling via Polynomial Regression, by Morala, Cifuentes, Lillo, and Iñaki Ucar. The point is then, why go through the problems of neural networks--convergence, local minima and so on--when one can work more simply with polynomials?

Of course, it is not quite that simple. If we start with p variables in our model, the d-degree polynomial version will have O(p^d) variables, which can easily become computationally challenging. Nevertheless, our experiments have had quite encouraging results.
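To get a feel for this growth, the standard combinatorial count can be evaluated directly. The sketch below assumes p continuous predictors and counts all monomials of total degree at most d; the helper name n_poly_terms is hypothetical, and the number of terms polyreg actually forms may be smaller when dummy variables are present.

```r
# Number of non-constant monomials of total degree <= d in p variables:
# choose(p + d, d) - 1  (the "- 1" drops the intercept term).
# For fixed d this grows like p^d / d!.
n_poly_terms <- function(p, d) choose(p + d, d) - 1

n_poly_terms(5, 2)    # 20
n_poly_terms(20, 2)   # 230
n_poly_terms(20, 3)   # 1770
n_poly_terms(100, 3)  # 176850
```

Even a modest degree with a moderate number of predictors can thus produce more columns than many data sets have rows.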

Usage

The main functions are polyFit() and predict.polyFit(). One can fit either regression or classification models.

Example

Programmer/engineer data from the 2000 Census, Silicon Valley, built into the package.

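library(polyreg)
data(pef)

# model wage income, fitting a degree-2 model;
# the columns are reordered so that the response (wage income, column 5) comes last
pfout <- polyFit(pef[,c(1,2,3,4,6,5)],2,use='lm')

# predict wage of person like that in row 1, but age 48 and female
newx <- pef[1,-5]  # drop the response column
newx$age <- 48
newx$sex <- 1
predict(pfout,newx)  # about $84,330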
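For readers who want to see what a degree-2 polynomial model amounts to without installing anything, here is a rough base-R analogue. It is a sketch on synthetic data and does not use polyreg; the names x1, x2, dat and fit are made up, and it has none of the package's handling of dummy variables.

```r
# Synthetic data: the true regression function is a degree-2 polynomial
set.seed(42)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- 1 + 2 * x1 - x2 + 3 * x1 * x2 + x2^2 + rnorm(n, sd = 0.1)
dat <- data.frame(x1, x2, y)

# polym() with raw = TRUE forms all monomials of total degree <= 2:
# x1, x1^2, x2, x1*x2, x2^2
fit <- lm(y ~ polym(x1, x2, degree = 2, raw = TRUE), data = dat)
length(coef(fit))  # intercept plus 5 polynomial terms

# Predict at a new point; the noise-free value there is 2.5
pred <- predict(fit, newdata = data.frame(x1 = 0.5, x2 = 0.5))
pred
```

With only two continuous predictors this is easy to do by hand; the bookkeeping becomes tedious with many predictors and categorical variables.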
