Do you need to learn math to do data mining with Python?

Updated on educate 2024-03-06
4 answers
  1. Anonymous user, 2024-02-06

    I recommend learning a little math. Classification, clustering, regression, recommendation, and the other families of algorithms all rest on mathematical foundations, and you need some of that foundation to understand them. It also lets you interpret results with confidence. Many Python packages can be called directly and will produce results, but when you need accuracy you still have to write (def) the functions yourself. So to do well in this field your math does not have to be excellent, but it cannot be too weak.

    The relationship between data mining in Python and mathematics is as follows:

    1. Data mining is not intended to replace traditional statistical analysis. Rather, it is an extension and development of statistical methodology.

    Most statistical analysis techniques rest on sound mathematical theory and sophisticated technique. Their accuracy is satisfactory, but they demand a lot from the user. As computers grow more powerful, the same work can be accomplished with relatively simple, fixed procedures that rely on raw computing power.

    2. Some data mining algorithms were originally statistical methods, but once they moved into the computing field they took on that field's own rules, much as the database management system (DBMS) was built on top of the file system. People who study data mining care about how it copes with large data volumes (efficiency), about data mining primitives (a data mining language), standard interfaces, and other matters that only come up when the methods are implemented in software.

    Optimizing algorithm performance has become one of the recognized yardsticks in the data mining industry.

    3. Data mining is still part of machine learning and artificial intelligence, and its core is rules. Data mining algorithms draw on statistics, but the technique itself is no longer statistics. Take a rule that relates "age" and "salary": a data mining algorithm can derive it. Before deriving such a rule, the algorithm analyzes the data set, which contains many variables (fields in the database), say 10, of which "age" and "salary" are two. The algorithm automatically picks out these two variables on the basis of historical data and arrives at the rule.

    Statistics, by contrast, cannot derive such a rule. It can only derive quantitative probabilistic relationships, and deriving rules falls outside its scope.
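    The idea of an algorithm scanning several variables and settling on a rule can be illustrated with a very small sketch. Everything in it is an assumption made for illustration: the records, the field names (`age`, `salary`, `tenure`, `bought`), and the choice of a one-level "decision stump" that tests a single variable. It is not the specific algorithm the answer has in mind, and real rule learners combine several variables.

```python
# Toy "historical" records; every value and field name here is invented.
records = [
    {"age": 23, "salary": 3000, "tenure": 1, "bought": 0},
    {"age": 25, "salary": 3500, "tenure": 4, "bought": 0},
    {"age": 31, "salary": 5200, "tenure": 2, "bought": 0},
    {"age": 38, "salary": 8000, "tenure": 3, "bought": 1},
    {"age": 42, "salary": 9000, "tenure": 1, "bought": 1},
    {"age": 47, "salary": 12000, "tenure": 5, "bought": 1},
]


def best_rule(rows, fields, target):
    """Return (field, threshold, accuracy) for the best rule
    of the form 'field > threshold => target = 1'."""
    best = None
    for field in fields:
        values = sorted({r[field] for r in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            hits = sum((r[field] > threshold) == bool(r[target]) for r in rows)
            accuracy = hits / len(rows)
            # Keep the first rule that reaches the highest accuracy.
            if best is None or accuracy > best[2]:
                best = (field, threshold, accuracy)
    return best


field, threshold, accuracy = best_rule(records, ["age", "salary", "tenure"], "bought")
print(f"IF {field} > {threshold} THEN bought = 1 (accuracy {accuracy:.2f})")
```

    The search ignores `tenure` because no threshold on it separates the two groups, which is the sense in which the algorithm "extracts" the useful variables from the data.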


  2. Anonymous user, 2024-02-05

    For data mining, it makes sense to learn a little math.

    Classification, clustering, regression, recommendation, and the other families of algorithms all need a mathematical foundation to be understood. With some math behind you, you can also interpret results with confidence. Many Python packages can be called directly and will produce results, but when you need accuracy you still have to write (def) the functions yourself.

    So to do well in this field, your math does not have to be excellent, but it cannot be too weak.

  3. Anonymous user, 2024-02-04

    Python is a handy scripting language. When used for data mining, it relies on libraries and on your own ability with algorithms.

    For purely numerical work, libraries such as numpy and matplotlib are usually used. There are also libraries for semantic analysis. That said, Python's raw computing performance is somewhat weak.

    Once the amount of data gets large, Python alone cannot keep up, so it is usually used together with Hadoop.

    Some algorithms have strict real-time requirements, and for those the Python extensions are usually written in C.

  4. Anonymous user, 2024-02-03

    As long as you can solve practical problems, it doesn't matter which tools you use to learn data mining. Python is my first recommendation here.

    1. Working with the pandas library.

    pandas is a particularly important library for data analysis, and you need to master the following three points:

    pandas group computations;

    pandas indexes and multi-level indexes;

    Indexing is difficult, but it is very important.

    pandas multi-table operations and pivot tables.
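    The three pandas points above can be tried out on a tiny table. This is only a sketch: the `sales` and `managers` tables, their column names, and their values are invented for the example.

```python
import pandas as pd

# Hypothetical data, invented for this example.
sales = pd.DataFrame({
    "region": ["east", "east", "west", "west", "west"],
    "product": ["a", "b", "a", "a", "b"],
    "amount": [10, 20, 5, 15, 30],
})
managers = pd.DataFrame({"region": ["east", "west"], "manager": ["Li", "Wang"]})

# Group computation: total amount per region.
per_region = sales.groupby("region")["amount"].sum()

# Multi-level index: grouping by two keys yields a MultiIndex,
# and a tuple selects one (region, product) entry.
per_pair = sales.groupby(["region", "product"])["amount"].sum()
west_a = per_pair.loc[("west", "a")]

# Multi-table operation: join the two tables on their shared key.
merged = sales.merge(managers, on="region", how="left")

# Pivot table: regions as rows, products as columns, summed amounts as cells.
pivot = sales.pivot_table(index="region", columns="product",
                          values="amount", aggfunc="sum")
print(pivot)
```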

    2. Numerical computing with numpy.

    understanding numpy arrays;

    array index operations;

    array computations; broadcasting (this draws on linear algebra).
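    A short sketch of the numpy topics listed above: indexing, array computation, and broadcasting. The array contents and variable names are arbitrary choices for illustration.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # 3x4 array holding 0..11

# Index operations: single element, slice, boolean mask.
element = a[1, 2]                 # row 1, column 2
second_column = a[:, 1]           # every row, column 1
evens = a[a % 2 == 0]             # elements passing the mask, flattened to 1-D

# Array computation: element-wise arithmetic and reductions.
doubled = a * 2
column_sums = a.sum(axis=0)

# Broadcasting: the shape-(4,) vector is stretched across all 3 rows.
offsets = np.array([100, 200, 300, 400])
shifted = a + offsets
print(shifted)
```

    Broadcasting works here because the trailing dimension of `a` (4) matches the length of `offsets`; a length-3 vector would need to be reshaped to `(3, 1)` first.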

    3. Data visualization - matplotlib and seaborn

    matplotlib syntax.

    The most basic visualization tool for Python is matplotlib. At first glance matplotlib looks a bit like MATLAB, and it helps to work out how the two are related, because that makes it easier to learn.

    Using seaborn.

    seaborn is a visualization tool that produces very attractive plots.

    pandas plotting functions.

    As mentioned earlier, pandas is for data analysis, but it also provides some plotting APIs.

    4. Introduction to data mining.

    This part is the hardest and the most interesting, and there are a few things to master:

    The definition of machine learning.

    In practice there is little difference between this and data mining.

    The definition of the cost function.

    train/test/validate

    The definition of overfitting and how to avoid it.
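    Two of the ideas above, the cost function and the train/validate/test split, are small enough to sketch in plain Python. The function names `mse` and `split`, the 60/20/20 proportions, and the fixed seed are assumptions chosen for illustration, not something the answer prescribes.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error, a common cost function for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def split(rows, train=0.6, validate=0.2, seed=0):
    """Shuffle rows and cut them into train / validation / test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train)
    n_val = int(len(rows) * validate)
    return (rows[:n_train],
            rows[n_train:n_train + n_val],
            rows[n_train + n_val:])


data = list(range(10))
train_set, val_set, test_set = split(data)
print(len(train_set), len(val_set), len(test_set))
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

    The usual workflow is to fit on the training part, compare model settings on the validation part, and report the final figure on the test part. A cost that is much lower on the training part than on the validation part is the typical sign of overfitting.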

    5. Data mining algorithms.

    As data mining has developed, many algorithms have appeared. You only need to master the simplest, most central, and most commonly used ones:

    least squares;

    gradient descent; vectorization; maximum likelihood estimation;

    logistic regression;

    decision trees;

    random forest;

    xgboost;
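    The first three items in this list fit together in one small example: a least-squares line fit, solved once in closed form and once by vectorized gradient descent. The synthetic data (points on y = 2x + 1), the learning rate, and the iteration count are assumptions picked so that the result is easy to check.

```python
import numpy as np

# Synthetic, noise-free data from y = 2x + 1, so the right answer is known.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept column

# Closed-form least squares.
w_exact, *_ = np.linalg.lstsq(X, y, rcond=None)

# Gradient descent on the mean squared error.
# Vectorized: the gradient is one matrix expression, no loop over samples.
w = np.zeros(2)
learning_rate = 0.5
for _ in range(2000):
    gradient = 2.0 / len(y) * X.T @ (X @ w - y)
    w -= learning_rate * gradient

print(w_exact, w)   # both should be close to [1, 2] (intercept, slope)
```

    This is where the math pays off: knowing that the gradient of the squared error is `2/n * X.T @ (X @ w - y)` is what lets you write the update yourself instead of relying on a package.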

    6. Data mining practice.

    Get to know the models hands-on through scikit-learn, the best-known machine learning library.
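    A first scikit-learn session might look like the sketch below. The choice of the bundled iris data set, a depth-3 decision tree, and a 70/30 split are assumptions made for illustration; any of the models listed earlier could take the tree's place.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small bundled data set and hold out 30% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Limiting the depth is one simple guard against overfitting.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

train_accuracy = model.score(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(f"train {train_accuracy:.2f}, test {test_accuracy:.2f}")
```

    Comparing the two accuracy figures is a quick check: a large gap between them suggests the model has memorized the training data.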
