What is a Feature in machine learning?

A feature in machine learning and pattern recognition is an individual measurable property of a phenomenon being observed. Choosing informative, discriminating and independent features is a crucial step for effective algorithms in pattern recognition, classification, and regression. Features are usually numeric, but structural features such as strings and graphs are used in syntactic pattern recognition. The concept of “feature” is related … Read more

What are False Positives?

A false positive, commonly called a “false alarm”, is a result that indicates a given condition has been fulfilled when it has not, i.e. a positive effect has been erroneously assumed. In the case of “crying wolf” – the condition tested for was “is there a wolf near the herd?”; the result was that there had not been a wolf near … Read more
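
As a rough illustration of the definition above, the sketch below counts false positives in a pair of boolean lists. The function name `count_false_positives` and the sample data are hypothetical and chosen only to mirror the “crying wolf” example.

```python
def count_false_positives(actual, predicted):
    """Count cases predicted positive (True) whose actual condition is negative (False)."""
    return sum(1 for a, p in zip(actual, predicted) if p and not a)


# Hypothetical data: was a wolf actually near the herd, and did the boy cry "wolf"?
wolf_present = [False, False, True, False]
cried_wolf = [True, False, True, True]

print(count_false_positives(wolf_present, cried_wolf))  # 2
```

Here the first and last alarms were raised with no wolf present, so both count as false positives.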

What is Exploratory Data Analysis?

Exploratory Data Analysis (EDA) in statistics is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task. Exploratory data analysis was promoted to encourage statisticians to explore … Read more

What is Euclidean Distance?

Euclidean distance in mathematics is the “ordinary” (i.e. straight-line) distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as a Pythagorean metric. A generalized term for the Euclidean norm is the L2 norm or L2 distance. The Euclidean distance … Read more
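
A minimal sketch of the straight-line distance described above, using only the Python standard library. The helper name `euclidean_distance` is an illustrative choice; Python 3.8+ also provides `math.dist` for the same computation.

```python
import math


def euclidean_distance(p, q):
    """Straight-line (L2) distance between two points with the same number of coordinates."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```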

What is Cross-Validation?

Cross-validation is a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice. The idea is to define a dataset to “test” the model in … Read more
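
One common way to set aside data for testing is k-fold splitting, sketched below under the assumption that samples are addressed by index and folds are contiguous. The excerpt does not name a specific scheme, so the function `k_fold_indices` is a hypothetical helper, not a library API.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


for train, test in k_fold_indices(6, 3):
    print("train:", train, "test:", test)
```

Each sample lands in the test set exactly once, so a model fitted on each `train` split and scored on the matching `test` split is always evaluated on data it did not see during fitting.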

What is a Boltzmann Machine?

A Boltzmann machine is a network of symmetrically connected, neuronlike units that make stochastic decisions about whether to be on or off. Boltzmann machines have a simple learning algorithm that allows them to discover interesting features in datasets composed of binary vectors. The learning algorithm is very slow in networks with many layers of feature detectors, but it can be made … Read more

What is Big Data?

Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Challenges include capture, storage, analysis, data curation, search, sharing, transfer, visualization, querying, updating and information privacy. The term “big data” often refers simply to the use of predictive analytics, user behavior analytics, or certain … Read more

What is Bias-variance trade-off?

The bias-variance trade-off is a central problem in supervised learning. Ideally, one wants to choose a model that both accurately captures the regularities in its training data and generalizes well to unseen data. In statistics and machine learning, the bias-variance trade-off is the problem of simultaneously minimizing two sources of error that prevent supervised learning algorithms from generalizing beyond their training … Read more

What is Bayesian statistics?

Bayesian statistics is a theory in the field of statistics in which the evidence about the true state of the world is expressed in terms of degrees of belief known as Bayesian probabilities. Such an interpretation is only one of a number of interpretations of probability and there are other statistical techniques that are not based on ‘degrees of belief’. … Read more

What is backpropagation?

Backpropagation, or the backward propagation of errors, is a common method of training artificial neural networks and is used in conjunction with an optimization method such as gradient descent. The algorithm repeats a two-phase cycle: propagation and weight update. When an input vector is presented to the network, it is propagated forward through the network, layer by layer, until it reaches … Read more
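
The two-phase cycle can be sketched on a deliberately tiny network: one sigmoid hidden unit feeding one linear output, trained with plain gradient descent on a squared-error loss. The architecture, learning rate, starting weights, and training sample are all assumptions made for illustration and do not come from the excerpt.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(x, target, w1, w2, lr=0.5):
    """One propagation + weight-update cycle; returns new weights and the loss before the update."""
    # Phase 1a: forward propagation, layer by layer.
    h = sigmoid(w1 * x)   # hidden activation
    y = w2 * h            # linear output
    # Phase 1b: backward propagation of the error via the chain rule.
    error = y - target                          # d(loss)/dy for loss = 0.5 * (y - target)**2
    grad_w2 = error * h                         # d(loss)/dw2
    grad_w1 = error * w2 * h * (1.0 - h) * x    # d(loss)/dw1
    # Phase 2: weight update (gradient descent).
    return w1 - lr * grad_w1, w2 - lr * grad_w2, 0.5 * error ** 2


w1, w2 = 0.5, 0.5   # arbitrary starting weights
losses = []
for _ in range(200):
    w1, w2, loss = train_step(1.0, 0.8, w1, w2)
    losses.append(loss)

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.2e}")
```

With a single training sample the loss shrinks steadily toward zero; real networks repeat the same cycle over many samples and many more weights.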