-
The four characteristics of "big data":
1: The volume of data is huge (volume). To date, all printed material ever produced by humans amounts to about 200 petabytes (1 PB = 2^10 TB), while everything all humans have ever said amounts to roughly 5 exabytes (1 EB = 2^10 PB).
At present, typical personal-computer hard drives hold data on the order of terabytes, while the data held by some large enterprises approaches exabytes.
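The binary storage units cited above can be checked with a short sketch. This is an illustration only; the 1 TB drive size used for comparison is an assumption, not from the original answer:

```python
# Binary storage units: 1 PB = 2**10 TB, 1 EB = 2**10 PB.
TB = 2**40        # bytes in one terabyte (binary prefix)
PB = 2**10 * TB   # 1 petabyte = 1,024 TB
EB = 2**10 * PB   # 1 exabyte  = 1,024 PB

printed_materials = 200 * PB   # all printed material, per the figure above
spoken_words = 5 * EB          # everything humans have ever said

# How many typical 1 TB personal hard drives would each require?
print(printed_materials // TB)  # 204800 drives
print(spoken_words // TB)       # 5242880 drives
```

The gap between the two numbers makes the scale of the units concrete: moving from PB to EB multiplies the drive count by another 1,024 per step.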
2: There are many types of data (variety). This diversity divides data into structured and unstructured forms.
Compared with the text-centric structured data that was easy to store in the past, unstructured data is growing rapidly, including web logs, audio, video, images, geolocation information, and so on, and these multiple data types place higher demands on data-processing capability.
3: Value density is low (value). Value density is inversely proportional to the total volume of data.
For example, in one hour of continuous, uninterrupted surveillance, the useful data may amount to only one or two seconds. How to complete the "purification" of data value more quickly through powerful machine algorithms has become a pressing problem in the big-data era.
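The inverse relationship between value density and total volume can be made concrete with a tiny calculation based on the surveillance example above (the two-second figure is the answer's own illustration):

```python
# "Low value density": in one hour of continuous monitoring,
# only ~2 seconds of footage may actually be useful.
total_seconds = 60 * 60   # one hour of recording
useful_seconds = 2        # the valuable fragment

value_density = useful_seconds / total_seconds
print(f"{value_density:.4%}")  # 0.0556%
```

Even a generous two useful seconds yields a density well below a tenth of a percent, which is why automated "purification" rather than manual review is needed at scale.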
4: It is velocity. This is the feature that most clearly distinguishes big data from traditional data mining. According to IDC's "Digital Universe" report, global data was expected to grow explosively by 2020.
-
1. The volume of data is huge.
Large data volume refers to large data sets, generally on the order of 10 TB, but in practice many enterprise users put multiple data sets together, forming petabyte-scale volumes. Reportedly, one large portal's homepage navigation serves data each day that, if printed, would require more than 500 billion sheets of A4 paper; by comparison, all printed material ever produced by humans amounts to only about 200 petabytes.
2. The data categories are large and diverse.
Data comes from a wide variety of sources, and data types and formats are increasingly rich, breaking through the previously narrow category of structured data to include semi-structured and unstructured data. Data now exists not only as text but also as video, images, audio, geolocation information, and other types, and personalized data accounts for the overwhelming majority.
3. Processing speed is fast.
Even with a very large volume of data, real-time processing can be achieved. Data processing follows the "1-second rule", allowing the most valuable information to be extracted quickly from many types of data.
4. Authenticity is high and value density is low.
With the rise of new data sources such as social data, enterprise content, and transaction and application data, the limitations of traditional data sources are being broken, and enterprises increasingly need effective information whose authenticity and security can be guaranteed. For example, in an hour of uninterrupted monitoring, the useful data may amount to only one or two seconds.
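The "1-second rule" mentioned in point 3 can be sketched as a latency budget check. The record format and the filtering step below are invented for illustration; the filter stands in for whatever real-time analysis a system actually performs:

```python
import time

# Generate a batch of synthetic records (100,000 items, invented schema).
records = [{"id": i, "value": i % 10} for i in range(100_000)]

# Process the batch and measure whether it fits in a one-second budget.
start = time.perf_counter()
high_value = [r for r in records if r["value"] > 7]  # keep the "valuable" records
elapsed = time.perf_counter() - start

print(len(high_value))   # 20000 records pass the filter
print(elapsed < 1.0)     # True: well within the 1-second budget
```

Real streaming systems enforce this kind of budget per micro-batch or per event rather than per run, but the principle is the same: measure latency against a fixed timeliness requirement.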
-
IBM has proposed the "5V" characteristics of big data:
1. Volume: The amount of data is large, covering collection, storage, and computation. Big data is typically measured starting at units of at least P (1,000 TB), E (1 million TB), or Z (1 billion TB).
2. Variety: Types and sources are diverse, including structured, semi-structured, and unstructured data, manifested concretely as web logs, audio, video, images, geolocation information, and so on. Multiple data types place higher demands on data-processing capability.
3. Value: The value density of data is relatively low; the value must be panned like gold from sand. With the wide application of the Internet and the Internet of Things, information sensing is ubiquitous and information is massive but low in value density. How to combine business logic with powerful machine algorithms to mine the value of data is the most important problem to solve in the big-data era.
4. Velocity: Data grows fast, must be processed fast, and timeliness requirements are high. For example, a search engine must let users query news from a few minutes ago, and a personalized recommendation algorithm must complete its recommendations in as close to real time as possible.
This is a distinctive feature that separates big data from traditional data mining.
5. Veracity: The accuracy and trustworthiness of the data, that is, the quality of the data.
-
The four basic characteristics of big data are: large amount of data, rapid response, data diversity, and low value density.
1. Large amount of data.
Terabytes, petabytes, and even exabytes of data need to be analyzed and processed.
2. A quick response is required.
The market changes rapidly and demands timely reaction to change, so data analysis must also be fast, placing higher demands on performance; relative to the speed at which it must be handled, the data volume effectively becomes "large".
3. Data diversity.
More and more unstructured data arrives from different sources and must be cleaned, organized, and filtered before it becomes structured data.
4. Low value density.
Because data collection may be untimely, samples may be incomplete, and data may be discontinuous, individual data points can be distorted; but once the volume reaches a certain scale, the larger body of data supports more truthful and comprehensive conclusions.
Big data, an IT industry term, refers to a collection of data that cannot be captured, managed, and processed by conventional software tools within a certain time frame, and is a massive, high-growth and diversified information asset that requires new processing models to have stronger decision-making, insight and process optimization capabilities.
In The Age of Big Data, Viktor Mayer-Schönberger and Kenneth Cukier argue that big data means analyzing and processing all of the data rather than relying on random analysis (sample surveys). The 5V characteristics of big data (proposed by IBM): Volume (large volume), Velocity (high speed), Variety (diversity), Value (low value density), and Veracity (authenticity).
-
The four basic characteristics of big data: big data is a large-scale data collection that far exceeds the capabilities of traditional database software tools in acquisition, storage, management, and analysis, and has four characteristics: massive data scale, rapid data flow, diverse data types, and low value density.
-
Big data refers to a collection of data that cannot be captured, managed, and processed with conventional software tools in an affordable time frame.
In The Age of Big Data, Viktor Mayer-Schönberger and Kenneth Cukier define big data as the analysis and processing of all data, without resorting to the shortcut of random analysis (sampling surveys). The 4V characteristics of big data: volume, velocity, variety, and value.
The research firm Gartner gives this definition: "big data" is a massive, high-growth, and diversified information asset that requires new processing models in order to deliver stronger decision-making power, insight, and process-optimization capability.
According to Wikipedia, big data is a collection of data that cannot be captured, managed, and processed with conventional software tools in an affordable time frame.
Big data technology's strategic significance lies not in holding a huge amount of data, but in the specialized processing of that meaningful data. In other words, if big data is compared to an industry, the key to that industry's profitability is improving the "processing capability" of data and realizing the "value-added" of data through processing.
-
The characteristics of big data are as follows:
1. A large amount
Big data is characterized by large data scale. With the development of the Internet, the Internet of Things, and mobile Internet technology, all trajectories of people and things can be recorded, and data has shown explosive growth.
2. Diversity. The breadth of data sources determines the diversity of data forms. Data can be divided into three categories: structured data, such as financial-system data, characterized by strong causal relationships between items; unstructured data, such as video and images, characterized by no causal relationship between items; and semi-structured data, such as documents and web pages, characterized by weak causal relationships between items.
3. High speed. The growth rate and processing speed of data are an important embodiment of the high speed of big data. Unlike traditional data carriers such as newspapers and letters, in the big-data era the exchange and dissemination of data are realized mainly through the Internet and cloud computing, so data is produced and spread very rapidly.
4. Value. The core feature of big data is value. Value density is inversely proportional to total data volume: the higher the value density, the smaller the total volume, and the lower the value density, the larger the total volume. Extracting any valuable information depends on massive underlying data; the challenge is how to complete the "value purification" of data within that mass more quickly through powerful machine algorithms.
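The three data categories in point 2 can be illustrated with a minimal sketch; the sample records below are invented for demonstration:

```python
import csv
import io
import json

# Structured: fixed rows and columns, like a financial-system table.
structured = io.StringIO("account,balance\nA001,1500\nA002,2300\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing but flexible, like a document or web page.
semi = json.loads('{"title": "report", "tags": ["finance", "q3"]}')

# Unstructured: free text (or images, video) with no fixed schema.
unstructured = "The quarterly numbers look strong overall."

print(rows[0]["balance"])          # '1500'
print(semi["tags"][1])             # 'q3'
print(len(unstructured.split()))   # 6 words
```

Structured data can be queried by column name directly; semi-structured data needs key-based navigation; unstructured data needs analysis (here, a trivial word count) before any field-like access is possible.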