-
First, RapidMiner. It is one of the world's leading data mining solutions, and it is widely respected because it is built on advanced technology. It covers a wide range of data mining tasks, and many experts say they use it to simplify the design and evaluation of data mining processes.
Second, HPCC (High Performance Computing and Communications). It is a program launched to accelerate the information highway, with a reported total investment of 10 billion US dollars. Its research goals were to develop scalable computing systems and software and to develop gigabit network technology. Because of its strong transmission capacity, it is used in big data analysis.
Third, Hadoop. Many newcomers treat Hadoop as a synonym for big data analysis, which shows how important it is. One reason it is so widely trusted is that it assumes from the outset that computing elements and storage may fail, and it keeps multiple copies of the data so that work can continue when a node fails.
Fourth, Pentaho BI. Unlike traditional BI products, it is a framework that is process-centric and solution-oriented. Pentaho BI makes it possible to bring standalone products such as Quartz and JFree together, and it can serve as the basis for complete solutions to complex business intelligence work.
These four tools are essential for big data analysis roles, and you should be able to use them flexibly and fluently.
-
Common tools include Excel, SAS, R, SPSS, Tableau, Python, and others. R and Python are free and open source; SAS and SPSS are commercial products.
-
1. Tableau, from an overseas vendor, is a tool that almost every data analyst mentions. It has built-in common analysis charts and some data analysis models, so you can quickly carry out exploratory data analysis and produce analysis reports. Because it is a business intelligence tool, the problems it solves lean toward business analysis. Tableau can quickly produce dynamic, interactive charts, and its charts and color schemes are well designed.
2. The domestic vendor FineSoft offers a mature and very cost-effective self-service BI tool. It has a rich set of built-in charts that can be dragged and dropped directly, with no code needed, and it includes some data mining models.
It can be used for quick analysis of business data, dashboards, and large-screen visualizations. It is an affordable alternative to Tableau and differs from it in having more features for enterprise-level data analysis. As its built-in ETL functions and data processing methods show, it focuses on rapid analysis and visualization of business data.
It can be combined with big data platforms and various multidimensional databases, so it is widely used in enterprise BI, and it is free for personal use.
-
For agile self-service BI (Pea BI) and enterprise-level BI (ABI), see the official website of Yixin Huachen.
-
In general, big data analysis starts with a big data database, such as MongoDB or GBase, to store the data. Next, data warehouse tools are used to clean, transform, and process the data so that the valuable parts can be extracted. Data modeling tools are then used for modeling.
Finally, visualization tools are used for visual analysis.
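The stages above (store, clean, model, visualize) can be sketched end to end on a toy scale. This is only an illustration in plain Python, not how any of the tools named here work internally: the `RAW` records, the column meanings, and the helper names `clean`, `fit_line`, and `text_chart` are all made up for the example, and a least-squares line stands in for "modeling".

```python
# Hypothetical raw records: (month, ad_spend, revenue). None marks a missing value.
RAW = [
    (1, 10.0, 25.0),
    (2, 20.0, 45.0),
    (3, None, 50.0),  # incomplete row, dropped during cleaning
    (4, 30.0, 65.0),
    (5, 40.0, 85.0),
]


def clean(rows):
    """Cleaning step: keep only rows with no missing values."""
    return [r for r in rows if all(v is not None for v in r)]


def fit_line(xs, ys):
    """Modeling step: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def text_chart(rows, width=40):
    """Visualization step: one text bar per month, scaled to the largest revenue."""
    top = max(r[2] for r in rows)
    return [f"month {r[0]}: " + "#" * round(r[2] / top * width) for r in rows]


rows = clean(RAW)
slope, intercept = fit_line([r[1] for r in rows], [r[2] for r in rows])
chart = text_chart(rows)

print(f"revenue ~ {slope:.2f} * ad_spend + {intercept:.2f}")
print("\n".join(chart))
```

In a real project each function would be replaced by a dedicated tool, but the order of the steps stays the same.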
Following this workflow, the tools used at each stage are discussed below.
1. Big data tools: data storage and management tools.
Big data starts with data storage, that is, with the big data framework Hadoop. It is an open-source software framework from the Apache Software Foundation for storing very large data sets in a distributed manner across large computer clusters. Since big data involves a great deal of information, storage is crucial.
But in addition to storage, there needs to be some way to bring all of this data together into a structured, governed form from which insights can be drawn.
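To make the idea of distributed storage more concrete, here is a minimal sketch of splitting a file into blocks and keeping several copies of each block on different machines. It is a simplification under stated assumptions, not Hadoop's actual code: the function names, the node names, and the round-robin placement are invented for illustration (real HDFS uses much larger blocks and a rack-aware placement policy), although a replication factor of 3 does match the HDFS default.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut the data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int = 3) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes (simple round-robin, an assumption)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + k) % len(nodes)] for k in range(replication)]
        for block in range(num_blocks)
    }


def readable_blocks(placement: dict[int, list[str]], failed: set[str]) -> list[int]:
    """Return the blocks that still have at least one copy on a live node."""
    return [
        block
        for block, holders in placement.items()
        if any(node not in failed for node in holders)
    ]


# Hypothetical cluster and file, sized small enough to inspect by hand.
nodes = ["node-a", "node-b", "node-c", "node-d"]
blocks = split_into_blocks(b"x" * 1000, block_size=256)
placement = place_replicas(len(blocks), nodes)

print(placement)
print(readable_blocks(placement, failed={"node-a", "node-b"}))
```

With three copies of every block, losing any two of the four nodes still leaves every block readable, which is the point of replicating data across a cluster.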
2. Big data tools: data cleaning tools.
The data warehouse tool Hive is built on the Hadoop Distributed File System, and its data is stored in HDFS. Hive has no special storage format of its own and does not build an index on the data. When creating a table, you only need to specify the column separator and row separator, and Hive can then parse the data.
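The "tell it the separators and it can parse the data" idea can be sketched in a few lines. This is not Hive itself and not its API: `TableDef` and `read_table` are hypothetical names, and the example only shows how a table definition with a column separator and a row separator is enough to interpret raw text at read time.

```python
from dataclasses import dataclass


@dataclass
class TableDef:
    """Hypothetical table definition: column names plus the two separators."""
    columns: list[str]
    field_sep: str = ","
    row_sep: str = "\n"


def read_table(raw: str, table: TableDef) -> list[dict]:
    """Parse raw text at read time using only the declared separators."""
    rows = []
    for line in raw.split(table.row_sep):
        if not line:
            continue  # skip the empty piece left by a trailing row separator
        values = line.split(table.field_sep)
        # Pad short rows with None and drop extra fields so a bad line does not abort the read.
        values = (values + [None] * len(table.columns))[:len(table.columns)]
        rows.append(dict(zip(table.columns, values)))
    return rows


# Made-up sample data: tab-separated columns, newline-separated rows, one incomplete row.
RAW = "1\tAlice\t34\n2\tBob\n3\tCarol\t29\n"
users = TableDef(columns=["id", "name", "age"], field_sep="\t")
parsed = read_table(RAW, users)

print(parsed)
```

The raw text is never converted or indexed in advance; the structure is applied only when the data is read, and all values stay as strings in this sketch.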
3. Big data tools: data modeling tools.
SPSS: Used mainly for data modeling. It is stable and powerful, and it can meet the needs of small and medium-sized enterprises when building business models.
4. Big data tools: data visualization and analysis tools.
ABI, Yixin Huachen's one-stop data analysis platform, brings the tools mentioned above together on one platform. In addition to Chinese-style complex reports, dashboards, and large-screen reports, ABI also supports self-service analysis, including drag-and-drop multidimensional analysis, boards, and board sets, so business users can carry out exploratory analysis on their own simply by dragging and dropping.
Word-based ad hoc reports and slide reports add further ways to present results.
-
There are many data analysis tools on the market today, both domestic and foreign. I will introduce a few mainstream ones to the original poster.
Foreign:
Tableau: It positions itself as a visualization tool, similar to QlikView. Its visualization features are very powerful, but it has high hardware requirements and is more complex to deploy. On mobile, it currently supports only iOS.
QlikView: Its biggest competitor is Tableau. Like Tableau and many BI products in China, it is a new-generation lightweight BI product, which shows in its modeling, deployment, and use. It runs only on Windows and uses a client-server (C/S) architecture.
It uses in-memory dynamic computing. With small data volumes it is very fast; with large data volumes it consumes a great deal of memory and performance slows down.
Cognos: The most widely used of the traditional BI tools, since acquired by IBM. It has a strong database platform behind it and deep expertise in data management, data integration, and middleware.
It leans toward operational use and relies on manual modeling. Once requirements change, the model has to be rebuilt, and the learning curve is steep.
Domestic:
FineBI: FineSoft's self-service BI product. It is a lightweight BI tool that is easy to deploy and supports multidimensional analysis. Later upgrades only require updating the jar package, which makes it easy to maintain, and it is the most cost-effective option.
Yonghong BI: Agile BI software with high product stability. Data is processed with SQL, no programming interface is provided, and implementation is outsourced to third parties.