Category: BLOG

What is regularization, and why would you use it? In machine learning, the task is very often to fit a model to a set of training data and use the fitted model to make predictions or classify new (out-of-sample) data points. Sometimes a model fits the training data very well but does poorly at predicting out-of-sample data points. A model may…

Read More
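
Side note (not from the post itself): as a minimal, hedged sketch of what regularization buys you, the snippet below compares an unregularized polynomial fit with a Ridge (L2-penalized) fit using scikit-learn; the data, degree, and alpha are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Toy data: a noisy linear relationship (illustrative only)
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = 0.5 * X.ravel() + rng.normal(scale=0.5, size=30)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# High-degree polynomial features make plain least squares prone to overfitting
unregularized = make_pipeline(PolynomialFeatures(degree=10), LinearRegression())
regularized = make_pipeline(PolynomialFeatures(degree=10), Ridge(alpha=1.0))  # L2 penalty

for name, model in [("no regularization", unregularized), ("ridge", regularized)]:
    model.fit(X_train, y_train)
    print(name,
          "train R^2:", round(model.score(X_train, y_train), 3),
          "test R^2:", round(model.score(X_test, y_test), 3))
```

Typically the regularized model gives up a little training accuracy in exchange for noticeably better out-of-sample performance.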

Introduction to TensorFlow. What is TensorFlow? The shortest definition would be: TensorFlow is a general-purpose library for graph-based computation. But there are a variety of other ways to define TensorFlow; for example, Rodolfo Bonnin, in his book Building Machine Learning Projects with TensorFlow, offers a definition like this: “TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the…

Read More
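
As a side illustration of the “graph-based computation” idea in the post above (not taken from the post), here is a minimal TensorFlow 1.x-style sketch: operations first build a data flow graph, and a session then executes it.

```python
import tensorflow as tf

# Build a small data flow graph: nodes are operations, edges carry tensors
a = tf.constant(3.0, name="a")
b = tf.constant(4.0, name="b")
total = a * b + 2.0  # defines graph nodes; nothing is computed yet (TF 1.x)

# Run the graph in a session to actually evaluate the result
with tf.Session() as sess:
    print(sess.run(total))  # 14.0
```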

Data Science News Digest – handpicked articles, news, and stories from the Data Science world. NEWS: CUDA 9 Features Revealed – At the GPU Technology Conference, NVIDIA announced CUDA 9, the latest version of CUDA’s powerful parallel computing platform and programming model. Explaining How End-to-End Deep Learning Steers a Self-Driving Car – As part of a complete software stack for autonomous driving, NVIDIA has created a deep-learning-based…

Read More

Below is a list of free resources to learn TensorFlow: the TensorFlow website (www.tensorflow.org); a free Udacity course (www.udacity.com); Google Cloud Platform (cloud.google.com); a free Coursera course (www.coursera.org); Machine Learning with TensorFlow by Nishant Shukla (www.tensorflowbook.com); ‘First Contact With TensorFlow’ by Prof. Jordi Torres (jorditorres.org), also available to order from Amazon: First Contact With Tensorflow; Kadenze Academy (www.kadenze.com); OpenShift (blog.openshift.com); a tutorial by pkmital (github.com); a tutorial by HyunsuLee (github.com); a tutorial by orcaman (github.com); Stanford…

Read More

TensorFlow Quick Reference Table – Cheat Sheet. TensorFlow is a very popular deep learning library, but its complexity can be overwhelming, especially for new users. Here is a short summary of often-used functions; if you want to download it as a PDF, it is available here: TensorFlow Cheat Sheet – SecretDataScientist.com. If you find it useful, please share it on social media. Import TensorFlow: import tensorflow as tf…

Read More
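
To give a flavour of the kind of entries such a quick-reference table collects (these particular lines are an illustrative sample, not the cheat sheet itself), in TensorFlow 1.x style:

```python
import tensorflow as tf

# Constants, variables, and placeholders (TF 1.x style)
c = tf.constant([[1.0, 2.0], [3.0, 4.0]])        # fixed tensor
w = tf.Variable(tf.zeros([2, 2]), name="w")      # trainable tensor
x = tf.placeholder(tf.float32, shape=[None, 2])  # fed in at run time

# A few commonly used operations
y = tf.matmul(x, w) + c        # matrix multiply with a broadcasted add
s = tf.reduce_sum(y, axis=1)   # sum along an axis

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(s, feed_dict={x: [[1.0, 0.0]]}))
```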

What is TensorFlow? TensorFlow is an open source software library for machine learning developed by Google’s Google Brain team. The name TensorFlow derives from the operations that neural networks perform on multidimensional data arrays, often referred to as “tensors”. It uses data flow graphs and is capable of building and training a variety of different machine learning algorithms and deep neural networks, but it is…

Read More
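
As a small, hedged sketch of the “building and training” part mentioned above (again TensorFlow 1.x style; the data and learning rate are just for illustration), here is a single weight fitted by gradient descent:

```python
import tensorflow as tf

# Data for y = 2x, which the single weight should learn to approximate
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
w = tf.Variable(0.0)

pred = w * x
loss = tf.reduce_mean(tf.square(pred - y))               # mean squared error
train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(200):
        sess.run(train, feed_dict={x: xs, y: ys})
    print(sess.run(w))  # should approach 2.0
```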

Popular Pandas snippets used in data analysis. Pandas is a very popular Python library for data analysis, manipulation, and visualization; I would like to share my personal view on the list of most often used functions/snippets for data analysis. 1. Import Pandas to Python: import pandas as pd 2. Import data from a CSV/Excel file: df = pd.read_csv('C:/Folder/mlhype.csv')  # imports the whole CSV into a DataFrame df = pd.read_csv('C:/Folder/mlhype.csv', usecols=['abv', 'ibu'])…

Read More
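
For convenience, here are the first two snippets from the post above as one runnable block (the CSV path and column names come from the excerpt; the file itself is of course a placeholder), plus two common follow-up calls:

```python
import pandas as pd

# 1. Import Pandas (above)
# 2. Import data from a CSV file
df = pd.read_csv('C:/Folder/mlhype.csv')                                  # whole CSV into a DataFrame
df_subset = pd.read_csv('C:/Folder/mlhype.csv', usecols=['abv', 'ibu'])   # only selected columns

# Typical first looks at the data
print(df.head())       # first few rows
print(df.describe())   # summary statistics for numeric columns
```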

Hadoop YARN is the architectural center of Hadoop that allows multiple data processing engines such as interactive SQL, real-time streaming, data science and batch processing to handle data stored on a single platform, unlocking an entirely new approach to analytics. YARN is the foundation of the new generation of Hadoop and is enabling organizations everywhere to realize a modern data architecture. YARN also extends the…

Read More

Hadoop Flume was created as an Apache incubator project to let you flow data from a source into your Hadoop environment. In Flume, the entities you work with are called sources, decorators, and sinks. A source can be any data source, and Flume has many predefined source adapters. A sink is the target of a specific operation (and in Flume, among other…

Read More

Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Its storage layer is essentially a “massively scalable pub/sub message queue architected as a distributed transaction log,” making it highly valuable for enterprise infrastructures that process streaming data. Additionally, Kafka connects…

Read More
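
The post above describes Kafka as a pub/sub log for real-time feeds; as a hedged sketch of that produce/consume pattern using the third-party kafka-python client (the broker address, topic name, and message are assumptions, not from the post):

```python
from kafka import KafkaProducer, KafkaConsumer

# Publish a message to a topic (broker address and topic are illustrative)
producer = KafkaProducer(bootstrap_servers='localhost:9092')
producer.send('clickstream', b'user=42 action=view page=/home')
producer.flush()

# Subscribe to the same topic and read messages from the beginning
consumer = KafkaConsumer('clickstream',
                         bootstrap_servers='localhost:9092',
                         auto_offset_reset='earliest',
                         consumer_timeout_ms=5000)
for message in consumer:
    print(message.value)
```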