Wednesday, April 20, 2022

Data Science vs. Business Analytics


Key Differences Between Data Science and Business Analysis:

Here are some of the key differences between data scientists and business analysts.

1. Data science is the science of studying data using statistics, algorithms and technologies, and business analysis is the statistical study of business data.

2. Data science is a relatively recent development in analytics, but business analytics has existed since the late 19th century.

3. Data science requires strong programming skills, whereas business analysis requires little programming.

4. Data science is a broad superset of business analysis. Therefore, anyone with data science skills can do business analysis, but not vice versa.

5. Because data science goes one step beyond business analysis, it is a luxury. Business analysis, however, is necessary for a company to understand how it operates and to gain insights.

6. The analytical results of data science cannot readily be used for a company's day-to-day decision making, whereas business analysis is essential for key management decisions.

7. Data science does not answer clear-cut questions; its questions are mostly general. Business analysis, in contrast, mainly answers very specific questions about finance and the business.

8. Data science can answer questions that can be used for business analysis, but not the other way around.

9. Data science uses both structured and unstructured data, while business analytics primarily uses structured data.

10. Data science has the potential to make a big leap, especially with the advent of machine learning and artificial intelligence, while business analysis is still advancing slowly.

11. Unlike business analysts, data scientists don't come across a lot of dirty data.

12. In contrast to business analysis, data science relies heavily on data availability.

13. The cost of investing in data science is high, while the cost of investing in business analysis is low.

14. Data science can keep up with today's data. Data is growing and branching into many data types, and data scientists have the skills needed to handle it. Business analysts, however, do not have these skills.


Data Science and Business Analytics Comparison Table

Below is the comparison table for Data Science and Business Analytics.

| Basis of comparison | Data Science | Business Analytics |
| --- | --- | --- |
| Coining of the term | In 2008, DJ Patil and Jeff Hammerbacher, from LinkedIn and Facebook respectively, coined the term Data Scientist. | Business analytics has been in use since Frederick Winslow Taylor put it into practice in the late 1800s. |
| Concept | An interdisciplinary field covering data inference, algorithm development, and data-driven systems. | Statistical principles are used to derive insights from business data. |
| Application: top 5 industries | Technology; Financial; Mix of fields; Internet-based; Academic | Financial; Technology; Mix of fields; CRM/Marketing; Retail |
| Coding | Coding is needed. The field combines traditional analytics approaches with a solid understanding of computer science. | Little coding is involved. The work is more statistically oriented. |
| Recommended languages | C/C++/C#, Haskell, Java, Julia, Matlab, Python, R, SAS, Scala, SQL | C/C++/C#, Java, Matlab, Python, R, SAS, Scala, SQL |
| Statistics | Statistics is used at the end of the analysis, after the algorithms have been created and coded. | The entire analysis is based on statistical principles. |
| Work challenges | Business decision-makers do not use data science results; inability to apply results to the company's decision-making process; lack of clarity about the questions that must be answered with the data set provided; data is unavailable or difficult to obtain; IT needs to be consulted; notable lack of domain expert involvement | Data is unavailable or difficult to obtain; dirty data; concerns about privacy; insufficient funds to purchase meaningful data sets from outside sources; inability to apply results to the company's decision-making process; lack of clarity about the questions that must be answered with the data set provided; tools have limitations; IT needs to be consulted |
| Data needed | Both structured and unstructured data. | Predominantly structured data. |
| Future trends | Machine learning and artificial intelligence | Cognitive analytics, tax analytics |

Data Structures in Data Science using Python


What is a Data Structure?

To make data manipulation and other data operations more efficient, a data structure is used to store data in an ordered manner.


Types of Data Structures:

1. Vector- A vector is one of the most basic data structures. It is homogeneous, meaning it contains only elements of the same data type. Numeric, integer, character, complex, and logical data types are possible.

How to Create a Vector in Python:
In Python, use the np.array( ) function from the NumPy package to create a vector.
import numpy as np

# Vector as a row
vector_row = np.array([1, 2, 3])
vector_row

# Vector as a column
vector_column = np.array([[1],
                          [2],
                          [3]])
vector_column
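
As a small sketch of the homogeneity described above (the variable names are illustrative and not taken from the referenced article), the dtype attribute reports the single element type of a vector, and the shape attribute distinguishes a row vector from a column vector:

```python
import numpy as np

# A row vector: one dimension with three elements
vector_row = np.array([1, 2, 3])

# A column vector: three rows and one column
vector_column = np.array([[1], [2], [3]])

# All elements share one data type; the exact integer width depends on the platform
is_integer = np.issubdtype(vector_row.dtype, np.integer)

print(vector_row.shape)     # (3,)
print(vector_column.shape)  # (3, 1)
print(is_integer)           # True
```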


2. Matrix- A matrix is a two-dimensional, homogeneous data structure, which means it holds only items of the same data type. When elements of different data types are passed in, they are coerced to a common type.

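How to Create a Matrix in Python:
Pass a nested list to the np.array( ) function to create a matrix as a two-dimensional array. The np.mat( ) function seen in older code was removed in NumPy 2.0, and the np.matrix class it returned is no longer recommended.
import numpy as np

matrix = np.array([[1, 2],
                   [1, 2],
                   [1, 2]])
matrix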
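
The coercion mentioned above can be seen by mixing types in one matrix. This is a minimal sketch with made-up values, using a plain two-dimensional NumPy array:

```python
import numpy as np

# Integers mixed with a float: every element is coerced to float
mixed_numbers = np.array([[1, 2], [3, 4.5]])

# Numbers mixed with a string: every element is coerced to a string
mixed_text = np.array([[1, 2], [3, "four"]])

print(mixed_numbers.dtype)    # float64
print(mixed_text.dtype.kind)  # U (Unicode string)
print(mixed_text[0, 0])       # 1, now stored as the string '1'
```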

 
3. Array- Arrays are data structures that can have several dimensions. Data is stored in an array as matrices, rows, and columns, and an element can be accessed using its matrix level, row index, and column index.

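How to Create an Array in Python:
Square brackets create a Python list, which is often used as a simple one-dimensional array. For arrays with several dimensions, pass nested lists to NumPy's np.array( ) function.
cars = ["Tata", "Maruti", "Mahindra"]
cars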
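
Access by matrix level, row index, and column index can be sketched with a three-dimensional NumPy array. The values and the name array_3d are assumptions chosen for illustration:

```python
import numpy as np

# Two levels (matrices), each with 2 rows and 3 columns
array_3d = np.array([
    [[1, 2, 3],
     [4, 5, 6]],
    [[7, 8, 9],
     [10, 11, 12]],
])

# Index order is level, row, column (all zero-based)
element = array_3d[1, 0, 2]

print(array_3d.shape)  # (2, 2, 3)
print(element)         # 9
```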

 
4. Series- In Python, a Series is provided by the Pandas package. It's a one-dimensional labeled array that can hold any data (integer, string, float, Python objects, etc.). The axis labels are referred to as the 'index'.

How to Create a Series in Python:
First create an array using the np.array( ) function. Then pass the array as input to the pd.Series( ) function.
import numpy as np
import pandas as pd

a = np.array(['H', 'a', 'n', 'u', 'm', 'a', 'n'])
s = pd.Series(a)
s
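
The 'index' labels do not have to be the default positions 0, 1, 2 and so on. A minimal sketch, assuming made-up fruit prices, shows a Series with explicit labels and a lookup by label:

```python
import pandas as pd

# A Series with explicit index labels instead of the default integer positions
prices = pd.Series([10.5, 20.0, 7.25], index=["apple", "banana", "cherry"])

# Look up a value by its label
banana_price = prices["banana"]

print(list(prices.index))  # ['apple', 'banana', 'cherry']
print(banana_price)        # 20.0
```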

 
5. Data Frame- A data frame is a two-dimensional, table-like structure. Each column contains the values of one variable, and each row contains one value from each column. A data frame can contain numeric, categorical, or character data. Every column should contain the same number of data items.

How to Create a Data Frame in Python:
In Python, a data frame is a collection of Series. Data frames are created with the pandas package, either with the pd.DataFrame( ) function or by reading a file. The pd.read_csv( ) function already returns a data frame, so no further conversion is needed.
import pandas as pd

# Read a CSV file into a data frame
df = pd.read_csv("C:/cars.csv")
df
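
To illustrate the idea that a data frame is a collection of Series without relying on a file, here is a hedged sketch that builds one from two Series of equal length. The column names and seat counts are invented for the example:

```python
import pandas as pd

# Two Series of equal length; the values are made up for illustration
names = pd.Series(["Tata", "Maruti", "Mahindra"])
seats = pd.Series([5, 4, 7])

# Each Series becomes one column of the data frame
df = pd.DataFrame({"name": names, "seats": seats})

# Selecting a single column returns a Series again
seats_column = df["seats"]

print(df.shape)                             # (3, 2)
print(isinstance(seats_column, pd.Series))  # True
```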


6. List- Lists can contain elements of various types, such as numbers, strings, vectors, and other lists. A matrix or a function can also be a member of a list. A list is an ordered and mutable collection, and it may contain duplicate values.

How to Create a List in Python:
Assign a variable to a pair of square brackets containing the desired values.
n = ["Red", "Radha", (21,32,11), True, 51.23]
n
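
The properties listed above (ordered, changeable, duplicates allowed) can be checked directly. This is a short sketch with illustrative values:

```python
colors = ["Red", "Green", "Red"]  # duplicates are allowed

colors[1] = "Blue"      # replace an element in place
colors.append([1, 2])   # a list can hold another list

red_count = colors.count("Red")

print(colors)     # ['Red', 'Blue', 'Red', [1, 2]]
print(red_count)  # 2
```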


7. Dictionary- Also known as a hash map, a dictionary maps keys to values. Keys must be hashable, so numbers, strings, and tuples can be used as keys but lists cannot; values can be of any type. A dictionary is mutable and indexed by key, and since Python 3.7 it preserves insertion order. It cannot contain duplicate keys, although values may repeat.

How to Create a Dictionary in Python:
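Open a curly bracket and enter each key followed by a colon and its value.
# Named my_dict rather than dict to avoid shadowing Python's built-in dict type
my_dict = {1: [1, 2, 3, 4], 'Name': 'Krishna'}
my_dict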
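
The sketch below shows how dictionary keys behave in practice. The name person and the added entries are hypothetical; it relies only on standard Python behavior, where keys are unique and must be hashable:

```python
person = {"Name": "Krishna", 1: [1, 2, 3, 4]}

person["City"] = "Mathura"  # add a new key
person["Name"] = "Radha"    # assigning to an existing key overwrites its value

# A tuple is hashable and can be a key; a list is not
person[(21, 32)] = "tuple key"
try:
    person[[21, 32]] = "list key"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(person["Name"])    # Radha
print(len(person))       # 4
print(list_key_allowed)  # False
```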


8. Tuple- A tuple is an ordered and immutable collection of elements. It can contain any number of items of various types (integer, float, list, string, etc.), and it allows duplicate members.

How to Create a Tuple in Python:
Make a variable, open a parenthesis, and fill in the values.
tuple1 = ("Banana", 1, False)
print(tuple1)
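
Immutability is the main difference from a list. As a minimal sketch with illustrative values, trying to change an element of a tuple raises a TypeError:

```python
item = ("Banana", 21, 21, 51.23)  # mixed types; the duplicate 21 is allowed

first = item[0]  # items are read by position

# Attempting to change an element raises TypeError
try:
    item[0] = "Apple"
    was_modified = True
except TypeError:
    was_modified = False

print(first)           # Banana
print(item.count(21))  # 2
print(was_modified)    # False
```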


References:
1. https://medium.com/@vinitasilaparasetty/data-structures-in-data-science-4f47d9c4ab94
