Let's learn about Data via these 693 free stories. They are ordered by the most reading time generated on HackerNoon. Visit the /Learn Repo to find the most read stories about any technology.
Data is the king, queen, oil, sun, and the moon.
1. The Difference Between JDBC, JPA, Hibernate, and Spring Data JPA
Connecting a database to a Java application is not an easy process. You need to consider the connection pool, the data access layer, etc.
2. An Intro to Resiliency, DHT, and Autonomous Economic Agents
According to the paper published by Lokman Rahmani et al., the S/Kademlia distributed hash table (DHT) used by the ACN is resilient against malicious attacks.
3. How the TypeScript Pick Type works
The Pick utility type lets us build new types from existing ones by selecting specific properties from them. Let's look at how it works and when to use it.
4. Top 10 Open Datasets for Linear Regression
On Hacker Noon, I will be sharing some of my best-performing machine learning articles. This listicle on datasets built for regression or linear regression tasks has been upvoted many times on Reddit and reshared dozens of times on various social media platforms. I hope Hacker Noon data scientists find it useful as well!
5. A Better Guide to Build Apache Superset From source
In this article, we’ll take a deep dive into how to build Apache Superset from source. The official documentation is too complicated for a new contributor, hence my attempt to simplify it.
6. 11 Best Climate Change Datasets for Data Science Projects
Data is a central piece of the climate change debate. With the climate change datasets on this list, many data scientists have created visualizations and models to measure and track the change in surface temperatures, sea ice levels, and more. Many of these datasets have been made public to allow people to contribute and add valuable insight into the way the climate is changing and its causes.
7. Introducing CatalyzeX: A Browser Extension for Machine Learning
Andrew Ng likes it, you probably will too!
8. How To Scrape Google With Python
Ever since the Google Web Search API was deprecated in 2011, I've been searching for an alternative. I need a way to get links from Google search into my Python script. So I made my own, and here is a quick guide on scraping Google searches with requests and Beautiful Soup.
9. Pyth and Auros are Bringing Real-Time High-Frequency Data to Blockchain Protocols
Auros, a company specialising in algorithmic trading and market making, and Pyth Network will provide access to high-frequency data in real-time.
10. The Difference Between Privacy and Security
For many, privacy and security seem to be words that are interchangeable. Yet, you can have one without the other and users need to be aware of what they get.
11. Ruby: How to read/write JSON File
In Ruby, reading a JSON file into a hash and writing a hash back to a JSON file can be done with standard file handling.
12. 6 Reasons to Utilize Sandbox Technology in Game Development
Running a successful online application is an exciting journey, but it is also full of challenges. It starts from product-market fit (PMF)
13. Crypto Singularity and Data Dignity: the Lowdown at Blockstack Summit
2019 has clearly been marked by a bearish wave (and by speculative events), and with that comes much-needed breathing room for the builders to lay the runway for the solutions proposed in the many white papers distributed all over the web.
14. How to Create a Simple Dashboard with Google Forms and Google Data Studio
Google products are generally free to use, so there is no need to go overboard if you handle simple data. No cost, just a productive dashboard.
15. What Qualifies You To Be A Cybersecurity Professional?
Data breaches and ransomware attacks are getting more common. If you want to get in on this industry as a cybersecurity professional, you need qualifications.
16. Running a Python Script to Scrape LinkedIn Profiles From Google
LinkedIn is a great place to find leads and engage with prospects. In order to engage with potential leads, you’ll need a list of users to contact. However, getting that list might be difficult because LinkedIn has made it difficult for web scraping tools. That is why I made a script to search Google for potential LinkedIn user and company profiles.
17. Busting AI Myths: "You Need Tons of Data for Machine Learning"
Leading researchers like Karl Friston describe AI as "active inference" —creating computational statistical models that minimize prediction-error. The human brain operates much the same way, also learning from data. A common argument goes:
18. Let Data Shed Some Light in the Midst of COVID-19
The burden the COVID-19 novel coronavirus has placed on the world is enormous. There’s a great thirst for information and clarity. So, we at Logz.io have decided to offer a Community COVID-19 Dashboard Project, so that everyone can better understand how the outbreak impacts the world and their region. We see that as a community effort. We invite the global community of engineers and data scientists to add data to this public dashboard that will cover not just the direct impact of the coronavirus on public health, but other aspects of society as well. We want to help everyone better understand the impact of COVID-19 anywhere around the world.
19. Introduction to a Career in Data Engineering
A valuable asset for anyone looking to break into the Data Engineering field is understanding the different types of data and the Data Pipeline.
20. 5 Mistakes That Make AI Data Labeling Ineffective
Data labeling and annotation is one of the biggest challenges businesses face in developing AI solutions. Here are the top 5 Data labeling mistakes.
21. Reimagining Support and Resistance Indicators with Blockchain Datasets
Support and resistance are two of the best-established concepts in technical analysis trading strategies. Conceptually, both support and resistance identify pricing points on an asset that favor a pause or reversal of a given trend. In traditional technical analysis, there are several indicators that model points of support and resistance, all of them based solely on price trends. Many of those techniques can be extrapolated to crypto-assets, but I think we can do a bit better. For the first time in history, we have an asset class that records parts of the behavior of individual investors and asset holders in public ledgers. That information is a gold mine when it comes to estimating objective levels of support and resistance.
22. How to Transform Your Data Into a Voice AI Knowledge Assistant
RAIN executives give a full breakdown of the build out and power of AI Voice Assistants.
23. Top 6 Data Visualization Tools for 2022
In this blog, you will discover the best data visualization tools to effectively analyze your datasets. Learn about the tools for creating intuitive visualizations.
24. 10 Best Stock Market Datasets for Machine Learning
For those looking to build predictive models, this article will introduce 10 stock market and cryptocurrency datasets for machine learning.
25. How to Create Dummy Data in Python
Dummy data is randomly generated data that can be substituted for live data. Whether you are a Developer, Software Engineer, or Data Scientist, sometimes you need dummy data to test what you have built, whether it is a web app, mobile app, or machine learning model.
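As a minimal sketch of the idea, assuming only Python's standard library (the `make_dummy_users` helper and its field names are hypothetical and not taken from the story itself), randomly generated records might look like this:

```python
import random
import string


def make_dummy_users(count, seed=None):
    """Return `count` randomly generated user records (hypothetical schema)."""
    rng = random.Random(seed)  # passing a seed makes the dummy data reproducible
    users = []
    for user_id in range(1, count + 1):
        # Build a random 8-letter lowercase name.
        name = "".join(rng.choices(string.ascii_lowercase, k=8))
        users.append({
            "id": user_id,
            "name": name,
            "email": f"{name}@example.com",
            "age": rng.randint(18, 90),  # inclusive on both ends
        })
    return users


if __name__ == "__main__":
    for user in make_dummy_users(3, seed=42):
        print(user)
```

Seeding the generator is a design choice that keeps test runs repeatable; dedicated libraries can produce more realistic values if needed.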
26. Good Ways To Make Your Data More Secure
Data security is a business challenge and a business opportunity, not a mere technical task for your IT department.
27. 10 Data Table Libraries for JavaScript
Tables are a useful tool for visualizing, organizing and processing data in JavaScript. To start using them, you need to download a free library or one for a reasonable price. Here is a list of 10 useful, functional, and reliable JS libraries that will help you work with tables.
28. An Intro to No-Code Web Scraping
Web scraping has broken the barriers of programming and can now be done in a much simpler and easier manner without using a single line of code.
29. How to get data from API in Excel
How to get JSON data from an API into an Excel table, explained in a simple tutorial with a formula. A ready-to-go, open-source VBA formula with an intuitive video tutorial.
30. Increase The Size of Your Datasets Through Data Augmentation
Access to training data is one of the largest blockers for many machine learning projects. Luckily, for many projects, we can use data augmentation to increase the size of our training data many times over.
31. Scraping Information From LinkedIn Into CSV using Python
In this post, we are going to scrape data from Linkedin using Python and a Web Scraping Tool. We are going to extract Company Name, Website, Industry, Company Size, Number of employees, Headquarters Address, and Specialties.
32. Object-Oriented Databases And Their Advantages
An object-oriented database is a type of database system that deals with modeling and creation of data as objects. The main advantage of this database is the cons
33. What are the Best Data Analytics Tools?
Data analytics is used for transforming raw data into useful insights.
34. Distributed Storage is the Best Data Storage Tool for The Metaverse
The most suitable data storage tool for Metaverse is undoubtedly distributed storage.
35. A Guide to Importing Smartsheet Data into SQL Server using SSIS
Easily back up Smartsheet data to SQL Server using the SSIS components for Smartsheet.
36. How to Avoid Consumer Lock-in with The Decentralised Web
This is the third blog post in our series exploring aspects of Arweave’s decentralised, permanent web (https://www.arweave.org/). You can catch up with the other parts here (https://medium.com/arweave-updates/building-the-decentralised-web-part-one-the-problem-9766f1987c91) and here (https://medium.com/arweave-updates/building-the-decentralised-web-part-two-the-components-97409d1fe545).
37. 5 Web3 Startups That Deserve Your Attention
I've worked with Blockchain & Web3 startups consistently since 2017. I've seen teams come and go, businesses flourish only to fail, and bull and bear markets prop up or kill great ideas, respectively.
38. Top 10 Data Science Project Ideas for 2020
As an aspiring data scientist, the best way for you to increase your skill level is by practicing. And what better way is there to practice your technical skills than making projects?
39. Secrets to Growth Marketing Data Engineering – Even in This Down Economy
Marketing is a big business and it's only going to grow bigger. One reason for this is that marketers need to keep growing the list of data points.
40. Ghost in Your Machine
What’s more frightening than Halloween? Data migration.
41. Data Labeling for AI Products: How to Process Thousands of Data Labels
Here are a handful of recent case studies that show the power of data labeling in action.
42. Facebook's Deepfake Challenge That Will defeat Deepfakes. Hopefully.
Nowadays, we are seeing a new wave of great advancements in different technologies. Things like Deep Learning, Computer Vision, and Artificial Intelligence are improving every single day. Researchers and scientists are finding amazing use cases for these technologies, which can change the direction of our world.
43. Building a Serverless Data Pipeline to Analyze Meetup data
44. Using SPyQL and Python to Run Command Line Analytics
SPyQL combines Python and SQL to make querying of CSV and JSON data easy. In this tutorial we analyse the geographical distribution of cell towers.
45. Learning SQL Can Give You a Major Career Boost
Why learning SQL is a major career boost with LogicLoop
46. 17 Open Crime Datasets for Data Science and Machine Learning Projects
For those looking to analyze crime rates or trends over a specific area or time period, we have compiled a list of the 16 best crime datasets made available for public use.
47. DOCSIS 3.1 Technology: Everything You Need to Know
In this tech guide, we will cover the important details about DOCSIS 3.1 technology.
48. What’s Wrong With GraphQL?
While GraphQL offers several benefits, there are some potential disadvantages and challenges to using it in C# to consider, before you decide to implement it.
49. Data Mapping and What It Means for Business Strategy
Data mapping solutions powered by AI and ML enable users to bridge the differences in the schemas of data source and destination in a target repository.
50. How to Become the Data Whisperer
The data whisperer is the function sitting between the business and the technologists.
51. A Beginner's Introduction to Database Backup Security
With more companies collecting customer data than ever, database backups are key.
52. How Data Teams Can Benefit From Running Like a Product Team
Product teams have a lot of great practices that data teams would benefit from adopting. Namely: user-centricity and proactivity.
53. Why Are Removed Posts Still Visible on Reddit?
Even if moderators delete a post that is breaking the rules of Reddit, it is still very easy to find.
54. How to model an efficient database for your application
What is Database Modeling?
55. Universal Data Tool: Time Series Data and Audio Labeling [Update 9]
If you haven’t heard of the Universal Data Tool yet, it’s an open-source web or desktop program to collaborate, build and edit text, image, video, and audio datasets with labels and annotations.
56. The Pros and Cons of Collecting Online and Offline Data
57. The Fastest Way to Become A Professional Data Analyst
Sharyph, a tech writer, goes over how to become a professional data analyst.
58. Sensor-based Control in Cobots: Its Opportunities and Challenges
An introduction to the basic formulation of the major sensor-servo problem, followed by its most common approaches, like touch-based,
59. Why User Testing is Your Competitive Advantage
Give your users time to explore the UI of your latest product offering so you can gain an understanding of how interaction can be improved.
60. 5 Best Website Categorization Tools
Website categorization refers to the process of classifying websites that users come into contact with into various categories.
61. Building A Secure Data Economy: An Interview with Ocean Protocol's Founder Bruce Pon
Ocean Protocol is technology that allows data sharing in a safe, secure and transparent manner without any central intermediary. Using Ocean Protocol, data scientists and artificial intelligence researchers can unlock and analyze big data, while respecting data privacy.
62. The Importance of Hypothesis Testing
Hypothesis tests are significant for evaluating answers to questions concerning samples of data.
63. The Importance Of Data in Sales in 2022
64. Why Data Governance is Vital for Data Management
Both data governance and data management workflows are critical to ensuring the security and control of an organization’s most valuable asset: data.
65. Building an Airtight Security Funnel Step-by-Step
In this article, we’ll walk through SharePass’s patent-pending security funnel, providing a step-by-step guide to building out your security pipeline.
66. Top 3 Benefits of Insurance Data Analytics
The importance of data analytics and data-driven decisions across the board and, in this case, for insurance data.
67. Decentralized Storage: Confronting the Challenges
Decentralized storage is still far from mature. Three key obstacles - technical, regulatory and adoption - currently stand in its way.
68. How Smart Analytics Can Help Small Businesses Boost Sales
Technology has taken over the world. Now is the time for small businesses to realize that what they need is tech. Smart analytics makes everything easier.
69. How to Build a Decoupled Microservice Using Materialize
One way to handle data in microservice architectures is to use a decoupled microservices architecture. This form of architecture can bring many benefits.
70. Decoding MySQL EXPLAIN Query Results for Better Performance
Understanding MySQL's EXPLAIN query output is essential to optimizing a query. EXPLAIN is a good tool for analyzing your query.
71. The Failed Promises of Extract, Transform, and Load—and What Comes Next
Faster, Better Insights: Why Networked Data Platforms Matter for Telecommunications Companies
72. How to Use Public Keys in Data Lifecycles
The data lifecycle (also known as the information lifecycle) refers to the full time period during which data is present in the system.
73. Kafka Authorization And NiFi Encryption to Amazon S3
No discussion of a typical ETL/ELT pipeline is complete without the keyword "Kafka".
74. Useful Resources for Data Structure & Algorithm Practice
These four resources may be useful for learning about data structures and practicing making algorithms for your advanced programming needs in your work.
75. Benefits of Corporate Data Backup and Best Practices to Keep in Place
Nowadays, companies are increasingly relying on corporate data backup solutions to guarantee the safety and recoverability of their data. Read on to learn more
76. A Guide to Web Scraping With JavaScript and Node.js
With the massive increase in the volume of data on the Internet, this technique is becoming increasingly beneficial for retrieving information from websites and applying it to various use cases. Typically, web data extraction involves making a request to the given web page, accessing its HTML code, and parsing that code to harvest some information. Since JavaScript is excellent at manipulating the DOM (Document Object Model) inside a web browser, creating data extraction scripts in Node.js can be extremely versatile. Hence, this tutorial focuses on JavaScript web scraping.
77. Using User Data After Google's Third-party Cookies Ban
Google announced that it would ban the usage of third-party cookies; it has made a lot of publishers afraid that they won't be able to utilize user data.
78. Data Lakehouses: The New Data Storage Model
Data lakehouses are quickly replacing old storage options like data lakes and warehouses. Read on for the history and benefits of data lakehouses.
79. How to Migrate Data from an MSSQL Server to PostGreSQL?
Thinking of shifting to a new database management engine? Here's how to migrate data from SQL Server to PostgreSQL.
80. How to Efficiently Manage Queues in SQL Databases
A queue using an SQL database? Well, you need to know the pros and cons, and a typical implementation.
81. How To Solve the Problem With Key Metrics In a B2B Product
To learn how B2B companies solve the problem with key metrics in a product, I caught up with Yuri Brankovsky who has worked in multiple digital products.
82. SubQuery to Make Blockchain Data Easily Accessible on the Cosmos Blockchain
SubQuery is a blockchain developer toolkit that allows for web3 infrastructure through a custom open-source API between data and decentralized applications.
83. Data Journalism 101: 'Stories are Just Data with a Soul'
Gone are the days when journalists simply had to find and report news.
84. 'At the Coalface of Implementing Data Stacks': kleene's Co-founder & CEO Andrew Thomas
2-minute look at the building of kleene.ai through a founder's eyes.
85. 9 Data Trends You’ll See in 2023
2022 saw the data space grow by leaps and bounds. Here are the top 9 things our team of data experts expects to see in 2023.
86. Make Data-Driven Decisions With Power BI Consulting & Implementation
Power BI offers a solution for businesses that need to manage large volumes of data. It's designed to help with even the heaviest data flows businesses have.
87. Scraping Glassdoor Job Data
Glassdoor is one of the biggest job markets in the world but can be hard to scrape. In this article, we'll legally extract job data with Python & Beautiful Soup
88. AI Is Making Our Concrete Buildings And Bridges Safer
AI's application to civil engineering and concrete construction is the future of structural safety. There have been various successful & innovative applications.
89. Proven Metrics and Important KPIs for Startups to Measure Success
What are the most important KPIs for startups to measure success? Find your answer in this article and learn what key product metrics to track to enable growth.
90. What is an API, Simply Explained
Connectivity is something amazing. Right now, we are used to using our computers or phones to buy, post, watch, etc. We can do lots of things actually. We are connected to the world and to each other.
91. What is a Citizen Data Scientist and How Do You Become One?
Data science has been democratized for the most part. AI is now mainstream! It's no longer the exclusive province of large companies with deep pockets.
92. What's in Store for Privacy and Personal Data Protection in 2022?
2021 saw many advancements in internet privacy, what does 2022 have in store?
93. The Black Market for Data is on the Rise
Once the laughingstock of the Internet, hackers are now some of the most wanted criminals in the world.
94. A Tale of Two Cities: Economic vs Digital Democracy
More than new laws and fines, we need to reconsider data ownership as a whole and discover new structures that place control back into the hands of the people.
95. An Overview of Cyber Insurance for MSPs
Cyber insurance is a type of insurance policy designed to protect businesses and individuals against losses resulting from cyber attacks and data breaches.
96. 8-Ways Data Mining Can Improve your Business
If your company is trying to make sense of its customer data, here’s a not-so-surprising fact for you: you aren’t alone. Far too many companies want to understand data and gain in-depth insight into the information they are sitting on. Let’s be clear: today, the success of a business lies in how efficient its data mining process is and in its expertise in processing the available data, as this can help it decipher the age-old questions that make or break it:
97. Metrics, logs, and lineage: 3 Key Elements of Data Observability
Data observability is built on three core blocks: metrics, logs, and lineage. What are they, and what do they mean for your data quality program?
98. Join to Write Data Into Your First Decentralized Database
The DB3 Network is a start-up project to build a decentralized, permissionless platform for programmable data processing.
99. What is Data Analytics and How It Can Be Used
WHAT IS DATA ANALYTICS?
100. Five Common Reasons Why Data Integration Projects Fail
It’s 3 AM. My alarm goes off and I groggily climb out of bed and crack open my laptop. One of our biggest customers needs their data delivered by 9 AM, and I’m getting up before sunrise to triple-check every data point before their delivery. Our data platform was built with hundreds of data audits, but this customer’s delivery was just too complex to feel 100% confident that we’ve captured all potential issues. This scenario would soon become a typical morning for me. Wake up. Coffee. Pray to the data gods for an inbox without 500 Zendesk ticket escalations.
101. How to Leverage Predictive Analytics in Your eCommerce Businesses
Predictive analytics is able to predict which customers are most likely to churn or which products are most likely to be returned. Here are 6 other use cases.
102. What is Digital Footprint Management?
Your digital footprint refers to the trail of information you generate when creating, sharing, or storing any digital data.
103. Reimagining Marketing Teams: Why Creatives Should Embrace the Tech Side of the Business
With the pace of the business world, it is increasingly important for marketers to embrace new technological breakthroughs to stay ahead of the curve.
104. Diffusion by Push Technology Now Supports MQTT
Support for the OASIS MQTT open standard protocol is the main feature added to Diffusion 6.6 Preview 2, the latest release of the Diffusion® Intelligent Event Data Platform.
105. Best Practices For Backend Data Security
Backend data security relies on encryption, access control, data backup, and other such features. These best practices are intended for the backend.
106. IoT in Smart Buildings – A Behind-The-Scenes Picture on Data Utilization
IoT and smart buildings are all about data, but how is all this data used and what kind of data can you get? Read Haltian's article about smart buildings & data
107. SocialFi — Social Networks on the Blockchain & What to Expect From Web 3
How do social networks of the future differ from the usual ones, and what projects to expect in 2023.
108. The Noonification: Immigrant Teens Are Working Dangerous Night Shifts in Factories (11/21/2022)
11/21/2022: Top 5 stories on the Hackernoon homepage!
109. Mastering NumPy Arrays(Part 1): Stacking and Splitting
A comprehensive guide to NumPy stacking: how to stack NumPy arrays on top of each other or side by side, and how to use axis to specify how we want to stack arrays.
110. Web3, Data, and the Issue of Self-Sovereignty
Whoever owns your data, owns your decisions.
111. 6 Biggest Differences Between Airbyte And Singer
We’ve been asked if Airbyte was being built on top of Singer. Even though we loved the initial mission they had, that won’t be the case. Airbyte's data protocol will be compatible with Singer’s, so that you can easily integrate and use Singer’s taps, but our protocol will differ in many ways from theirs.
112. Encoding Categorical Data for ML Algorithms
Encoding is a technique used to convert categorical data to numerical representations to be able to use the data in machine learning algorithms.
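One common form of this conversion is one-hot encoding. The following is a minimal pure-Python sketch of that idea; the `one_hot_encode` helper is a hypothetical name chosen for illustration, and the story itself may cover other encodings or use a library instead:

```python
def one_hot_encode(values):
    """Map each category to a list of 0/1 flags, one flag per distinct category."""
    categories = sorted(set(values))  # sorted so the column order is stable
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # switch on the column for this value's category
        encoded.append(row)
    return categories, encoded


if __name__ == "__main__":
    columns, rows = one_hot_encode(["red", "green", "blue", "green"])
    print(columns)  # ['blue', 'green', 'red']
    print(rows)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Each row contains exactly one 1, so no artificial ordering between categories is implied, which is why this encoding is often preferred over plain integer labels for unordered categories.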
113. Self-service Data Preparation Tools Can Optimize Big Data Efficiency for the IT Team
Self-service data preparation tools are designed for business users to process data without relying on IT, but that doesn’t mean IT users can't benefit too.
114. What installing the Messenger app tells us about Facebook
Messenger’s onboarding is a great case study of manipulative design
115. Top 20 Twitter Datasets for Machine Learning Projects
It is often very difficult for AI researchers to gather social media data for machine learning. Luckily, one free and accessible source of SNS data is Twitter.
116. What the 2020 Toilet Paper Shortage Can Teach Us About AI
AI-driven technology expedites this process and helps companies meet consumer demand, cater to consumer concerns, and personalize the consumer experience.
117. Quantum-resistant Encryption: Why You Urgently Need it
The Second World War brought to the front burner the world of espionage, which is the precursor of cybersecurity, as is seen in the modern world. Technological advancements such as the quantum computer necessitate that we take the war against cybercrimes to another level.
118. Jetpack DataStore in Android Explained
Jetpack DataStore is an Android data storage solution that helps when building Android-based mobile apps by providing a way to store and retrieve data.
119. Going Beyond "Have You Tried Unplugging It and Plugging It Back In?" Taking IT to the Clouds
In this new era of digital transformation, there isn’t really a good excuse for companies that claim they want to succeed but aren’t willing to invest in employ
120. How to Query Deeply Nested JSON Data in PSQL
Recently I had to write a script to change a JSON data structure in a PSQL database. Here are some tricks I learned along the way.
121. How Data-Driven Coaching Helps Employees Reach Their Potential
Data is everywhere. In the business world alone, we use it to track search engine traffic, monitor website activity, land sales, and improve customer service.
122. On the difficulty of creating a data science code of ethics
123. The Growth Marketing Writing Contest: Round 1 Results Announced!
Growth marketers - the wait is OVER. The first round results announcement of the Growth Marketing Writing Contest is now LIVE!
124. Decoding MySQL EXPLAIN Query Results for Better Performance (Part 2)
Understanding MySQL's EXPLAIN query output is essential to optimizing a query. EXPLAIN is a good tool for analyzing your query.
125. Facebook and Anti-Abortion Clinics Have Your Info
Facebook is collecting ultrasensitive personal data about abortion seekers and enabling anti-abortion organizations to use that data
126. Why the Gaming Chip Shortage in the Gaming Industry is not Game Over
The global chip shortage has taken the gaming industry by storm, as it is one of the biggest industries most affected, and the resupply of consoles can last unt
127. Data Loss Prevention: What is it, and Do You Need it?
Data Loss Prevention is a set of tools and practices geared towards protecting your data from loss and leak. Even though the name has only the loss part, in actuality, it's as much about the leak protection as it is about the loss protection. Basically, DLP, as a notion, encompasses all the security practices around protecting your company data.
128. Watch Out for Deceitful Data
Nowadays, most assertions need to be backed with data. As such, it is not uncommon to encounter data that has been manipulated in some way to validate a story.
129. What Is Modern Business Intelligence?
This article gives insight into some basic features and functionality that desirable modern BI software has and illustrates them with examples.
130. What is RFM (Recency, Frequency, Monetary) Analysis?
RFM analysis is a data-driven customer segmentation technique that allows marketing professionals to make tactical decisions based on rigorous data refinement.
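To make the three measures concrete, here is a minimal sketch that computes recency, frequency, and monetary value per customer. It assumes orders arrive as `(customer_id, order_date, amount)` tuples; the `rfm_table` helper and that input layout are hypothetical, and real analyses usually go on to bucket these raw values into scores:

```python
from collections import defaultdict
from datetime import date


def rfm_table(orders, today):
    """Return {customer: (recency_days, frequency, monetary_total)}.

    `orders` is an iterable of (customer_id, order_date, amount) tuples.
    """
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, order_date, amount in orders:
        # Recency is measured from the most recent order, so track the latest date.
        if customer not in last_order or order_date > last_order[customer]:
            last_order[customer] = order_date
        frequency[customer] += 1      # number of orders
        monetary[customer] += amount  # total spend
    return {
        customer: (
            (today - last_order[customer]).days,
            frequency[customer],
            monetary[customer],
        )
        for customer in last_order
    }


if __name__ == "__main__":
    sample_orders = [
        ("a", date(2023, 1, 1), 10.0),
        ("a", date(2023, 1, 10), 5.0),
        ("b", date(2022, 12, 1), 100.0),
    ]
    print(rfm_table(sample_orders, today=date(2023, 1, 31)))
```

With the sample data, customer "a" ordered recently and often but spent little, while customer "b" spent more in a single, older order; segmentation then groups customers with similar profiles.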
131. Exporting Data to Fit Your Needs
A lot of the work we do at ChartMogul centers around how we display and present your data in a clear and transparent way.
132. Understanding the 'Data is the New Oil' Analogy
Earlier, we lived in industrial and post-industrial societies, and gas and oil were the only things of value. Now, it’s the age of the information society, and data has replaced petrol as the economy’s driving force. The reason is that with the help of Big Data, people significantly improve production efficiency and business economics. That’s true.
133. From 1999 to 2020, Google Grew from 10k to 4.6B Daily Searches
The Internet Live Stats graph above pictures Google's first 13 years. Today, according to Internet Live Stats, 4,517,847,993 daily active internet users perform 4,781,309,755 daily Google searches.
134. AI and Crowdsourcing: Using Human-in-the-Loop Labeling
With human-in-the-loop data labeling, humans and machines complement each other, which results in simple solutions for a variety of difficult problems at scale.
135. Have You Read Your Privacy Notice in Detail?
Do you recall every company you have given consent to use your data as you browse a website or sign-up to a ‘free’ service? It's time we moved beyond consent
136. How to Make Sure Your Nonprofit is Complying With HIPAA
It is important for your non-profit organization to comply with HIPAA to protect health data. Here's how you can do so.
137. Blockchain Technology Improves Data Authentication and Transparency in Healthcare
Blockchain is the secret to trusting the data as it moves into our healthcare ecosystem.
138. Using a Relational Database to Query Unstructured Data
Using a relational database to search inside unstructured data
139. How Data Scientists Start Automating Their Tasks With Python
Introduction to automation with Python and my top 3 most used code snippets.
140. Trino: The Open-source Data Query Engine That Split from Facebook
If you want to accelerate Trino queries with a response time of seconds to minutes, click here to learn how Trino helps engineers.
141. NLP Datasets from HuggingFace: How to Access and Train Them
The Datasets library from Hugging Face provides a very efficient way to load and process NLP datasets from raw files or in-memory data. These NLP datasets have been shared by different research and practitioner communities across the world.
142. Principles of a Clean Relational Database
The article describes how a relational database should be designed to properly work in OLTP mode.
143. Behavioral Analytics: The Foundation of Targeted Marketing and Predictive Analytics
Learn how to capitalize on your business standards and increase the conversion rate by approximately 85% by analyzing customer behaviors with data you collect.
144. The hidden risk of ethics regulation
Regulating the tech industry won’t fix its ethical problems; it might make them worse. Mike Monteiro has written the most compelling argument I have seen for regulation. Regulation would address many of the kinds of ethical risks that have made headlines recently. But I think it would leave many risks in place and introduce new risks (a more systemic risk, in fact) that in the long term would actually expose the public and the industry to more potential downside than they currently face. Regulation at scale requires rules that stipulate what is ethical and what is not, in the case of the discussion of the ethics.
145. 6 Tips to Get More Value Out of Your Microsoft Power BI Dashboard & Reports
By using Microsoft Power BI, you increase the efficiency of your company through its interactive insights and visual cues. Here are 6 tips for Power BI users.
146. The Noonification: The Idea of “Safe Cex” Should Stay in 2022 (1/20/2023)
1/20/2023: Top 5 stories on the Hackernoon homepage!
147. Defining the Problem in Your Data Science Project Can Lead to Success
Defining data science problems the right way is hard work. The failure rate of data science initiatives is really high, often ~70-80%.
148. How 5 Massive Data Breaches Could Have Been Prevented
One of the biggest losses for companies? Inadequate cybersecurity.
149. A High Level Explanation of Data Types for Decision Makers
There are three different types of data: structured data, semi-structured data, and unstructured data.
150. Brace Yourself: Data Cleanup is Coming
It goes without saying that data is the cornerstone of any data analysis.
151. Fintech Should Focus On Long-Term Vision, Not Short-Term COVID Buzz
The COVID19 crisis has been playing out globally for over half a year, or almost a year counting its early phase in China. It’s been hurting a lot of sectors, but one particular sector stood to benefit — Fintech.
152. Automated Offline Backups Can Save the World
Ransomware is worse than malware: Systems and data are all locked up, and backups are all encrypted, too.
153. Machine Generated Whiskey
Thanks to Microsoft, and a lot of whiskey data.
154. Debugging My Love Life
Tinder's "Top Spotify Artists" feature is relatively shallow, but could be fixed easily. Here is a demonstration of how it works currently and what can change.
155. Software Development Tricks Coding for Beginners and More
This week on HackerNoon's Stories of the Week, we looked at three articles that covered the world of software development from employment to security.
156. Building an Efficient AI Platform for Data Preprocessing and Model Training
Lei Li, AI Platform Lead, and Zifan Ni, Senior Software Engineer from Bilibili, share how they increased the training efficiency on their AI platform.
157. Denial Of Service (DoS) Attacks: Nature And Method Of Infection
Denial Of Service or DoS attacks work by overloading the target host’s bandwidth, preventing other users from accessing the affected server, denying service.
158. The Three Basic Benefits of a Virtual Data Room
The popularity of online virtual data rooms has increased over the years. These are innovative software tools used for safe storage and sharing of files. As the world modernizes, people are using advanced technology to carry out their daily tasks. With everything going digital, it becomes more and more crucial to look for new methods of storing files. Gone are the days when people piled up hard copies of all their files in the office. Some people still do that, which wastes half of their time. Imagine you have a business meeting coming up and you can’t find a specific file because there is a huge unorganized bundle of files in your office. With virtual data rooms, all your files are well organized. You do not have to go through the hassle of finding a certain file. With just one click, the file appears in front of you in no time.
159. Meet Data: The Driving Power of Fintech
Of late, “Fintech” has been, and remains, a buzzword. It is transcending traditional banking and financial services, encompassing online wallets, crypto, crowdfunding, asset management, and pretty much every other activity that includes a financial transaction. It thereby competes directly and fiercely with traditional financing giants and their methods.
160. Executing a T-test in Python
In today’s data-driven world, data is generated and consumed on a daily basis. All this data holds countless hidden ideas and information that can be exhausting
161. Artificial Intelligence and Big Data
Artificial Intelligence and Big Data. These two terms seem to permeate the tech world in every possible way one can think of. Along with giant terms like Machine Learning, IoT, blockchain and related ones, AI and Big Data are set to dominate our world in the years ahead.
162. Public Health Improvements as a Result of Data Usage and Analysis in Healthcare
Big data has made a slow transition from being a vague bogeyman to being a force of profound and meaningful change. Though it’s far from reaching its full potential, data is already having an enormous impact on healthcare outcomes across the world, both at the public and individual levels.
163. Can Data Automation Transform The Workplace?
Every minute, a staggering 1,820 terabytes of data is created around the world. That’s more than 2.5 quintillion bytes every day! This data takes many forms, from Tweets and Instagram posts to the generation of new bitcoin.
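The two figures in this blurb can be checked against each other with a line of arithmetic. A minimal sketch, assuming decimal terabytes (1 TB = 10^12 bytes), which the blurb does not specify:

```python
# Assumption: decimal units, 1 TB = 10**12 bytes (the blurb does not say which).
BYTES_PER_TB = 10**12
MINUTES_PER_DAY = 24 * 60

tb_per_minute = 1_820  # figure quoted in the blurb
bytes_per_day = tb_per_minute * BYTES_PER_TB * MINUTES_PER_DAY

# One quintillion is 10**18, so express the daily total in quintillions.
quintillions_per_day = bytes_per_day / 10**18
print(f"{bytes_per_day:,} bytes/day = {quintillions_per_day:.2f} quintillion")
```

Under that assumption the per-minute rate works out to roughly 2.62 quintillion bytes a day, which is consistent with the "more than 2.5 quintillion" claim.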
164. If You Have Important Data: Make Sure Its Protected
Data transfer is very important and it keeps happening almost every minute. As we chat on various social media applications or even like a post, there is a transfer of information that is happening. While we may not be too bothered about the way in which information and data are transferred from the receiver to the sender and vice-versa, we, of course, would be concerned about the safety of the data and information that is flowing on the internet and other forms of communication.
165. Why Python Is Leading the Charge in Data Analytics
Python is one of the oldest mainstream programming languages, which is now gaining even more ground with a growing demand for big data analytics. Enterprises continue to recognize the importance of big data, and $189.1 billion generated by big data and business analytics in 2019 proves it right.
166. Building an AI Red Team to Stop Problems Before They Start
An incredible 87% of data science projects never go live.
167. Hacking Emotions: Interpreting Current Events through News Sentiment
How can organizations best measure news sentiment to gain insights about customer and investor behavior?
168. Why co-location is the best way to mine bitcoin
Since the recent Bitcoin halving event, most small and medium crypto miners have had to shut down their mining rigs. Simply put, it is not profitable to have a mining rig in your home at current market prices. However, there are some solutions to the issue.
169. 4 Critical Steps To Build A Large Catalog Of Connectors Remarkably Well
The art of building a large catalog of connectors is thinking in onion layers.
170. Google Analytics Heartbeat Data Visualization
An experiment in real-time data visualization
171. Ethereum Merge: “15 Days Before and After” Data Analysis, Сensorship in Ethereum Blockchain
In this article, I will analyze what actually happened, taking as a basis 15 days before and 15 days after the transition.
172. Trends Uncovered by Scraping OpenSea Data to Analyze NFT Collections
Web scraping OpenSea to get NFT data and trade history for the Bored Ape Yacht Club NFT collection.
173. Application Programming Interface (API): What it is and How to Use it
APIs are less like USB ports or fire hoses than they are like a person at a help desk in a foreign country. An API will not give you all of a program’s information or code (like a fire hose), because what would stop you from replicating the entire code base? Instead, an API provides you with data its programmers have made available to outside users. Even so, you have to know the language and ask the right questions to do anything with this data.
174. Kimball & Inmon vs. the Retail Store
Years back I read a blog about database scalability that simplified the definition of scalability by comparing it with activities in a kitchen. I was quite surprised at how successful the comparison was. Come to think of it, technology is and should be inspired by what’s happening around us. This pushed me into linking technology with my everyday life.
175. The Noonification: Internet Archives Silent Killer (2/2/2023)
2/2/2023: Top 5 stories on the Hackernoon homepage!
176. So You Just Became a Data Science Manager... Now What?
With the rise of data science there has been the rise of data science managers. So what do you need to keep in mind if you wish to join these data translators that are acting as a conduit between the business and technical data teams? Going from a practitioner to a manager — your job now is to make sure that data resources are being used optimally so how do you go about doing this effectively?
177. Using a REST API with Python
Requesting fitness data (backlog) from Terra requires HTTP requests, so I’m writing an essential guide here on using a REST API with Python.
178. The Retail Evolution: Customers Demand Enhancements to the Shopping Experience
Andreas Hassellof, Founder and CEO of Ombori, explores how changing customer behaviors impact retail, and drive retail technology innovation.
179. A Technologist Manifesto against Data Imperialism
Tactical Mission
180. [Everyday Tech Solutions] Turning Feedback Data into Actionable Advice
If you're working on something that users actually use, then you're most likely also acquiring data en masse. When it comes to free text feedback, this data might get lost or stay in the hands of some analysts. Here is how to take a few easy steps to turn that data into actionable advice instead.
181. 5 Reasons Why VPNs are not Safe in 2021
All good things must come to an end, which may be true for the VPN in 2021. VPNs have been a useful enterprise tool for companies since they started in the 90s,
182. How Can You Minimize Your Online Footprint
You may be shocked to find out what information is available about you and how it could be used. Here are steps you can take to minimize your online footprint.
183. Web Scraping Using Node.js
While there are a few different libraries for scraping the web with Node.js, in this tutorial, I'll be using the Puppeteer library.
184. 3 Types of Tools Needed For Effective Project Management, Remotely
Project management is perhaps the most crucial job in any organization. This is because the success of every project moves the organization steadily towards the goals and objectives that were established well in advance.
185. Can Your Organization's Data Ever Really Be Self-Service?
Self-serve systems are a big priority for data leaders, but what exactly does it mean? And is it more trouble than it's worth?
186. Tableau Vs. Power BI: The Complete Comparison
The world of analytics is continually evolving, introducing new goods and adjustments to the modern market. New companies are entering the market and well-know
187. Fenwick Tree Explained
Fenwick Tree is an interesting data structure that uses binary number properties to solve point update and range queries in your code in some situations.
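The "binary number properties" mentioned here come down to the lowest set bit of an index, `i & -i`, which tells each node how many elements it covers. The following is a minimal sketch of a Fenwick tree for sums, not the linked article's code; the class and method names are illustrative.

```python
class FenwickTree:
    """Binary indexed tree for sums; public indices are 0-based."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)  # internal array is 1-based

    def update(self, i, delta):
        """Point update: add delta to the element at index i."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i  # move to the next node whose range covers i

    def prefix_sum(self, i):
        """Sum of elements 0..i inclusive."""
        i += 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i  # strip the lowest set bit
        return total

    def range_sum(self, lo, hi):
        """Range query: sum of elements lo..hi inclusive."""
        if lo > hi:
            return 0
        below = self.prefix_sum(lo - 1) if lo > 0 else 0
        return self.prefix_sum(hi) - below


values = [3, 2, -1, 6, 5]
ft = FenwickTree(len(values))
for idx, v in enumerate(values):
    ft.update(idx, v)

print(ft.range_sum(1, 3))  # 2 + (-1) + 6
ft.update(2, 4)            # element 2 changes from -1 to 3
print(ft.range_sum(1, 3))  # 2 + 3 + 6
```

Both `update` and `prefix_sum` touch at most about log2(n) nodes, which is the trade-off that makes the structure attractive when updates and range queries are interleaved.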
188. Using Machine Learning to Build a Ride Acceptance Model for Uber
Objective: Predict if a driver will accept a ride request or not and find the probability of acceptance.
189. Humans, Data and Emergent Factors
Emergent factors are the factors that can arise from the interaction of a group of agents or a system.
190. Building a Propensity Model to Target Users Better in Marketing Campaigns
Propensity model to figure out the likelihood of a person buying a product on their return visit. We need to identify the probability to convert for each user.
191. What Happens When You Get Sick Right Now?
We are living in a weird time. Day by day we see more & more people coughing and getting sick, our neighbors, coworkers on Zoom calls, politicians, etc… But here’s when it becomes really, really scary — when you become one of “those” and have no clue what to do. Your reptile brain activates, you enter a state of panic, and engage complete freakout mode. That’s what happened to me this Monday, and I’m not sure I’m past this stage.
192. Is Cloud Computing Really More Sustainable?
We've all heard the environmental benefits of cloud computing, but there are some cons as well. Is the cloud really more sustainable?
193. Facebook Peeked at Your Info When You Applied for Student Aid Online
For millions of prospective college students, applying online for federal financial aid has also meant sharing personal data with Facebook, unbeknownst to them.
194. A How-to Guide for Data Backup and VM Modernization
Data is everywhere; it is something that we all rely on. It is used by individuals and large organizations that collect and store hundreds of files a day.
195. Merging Datasets from Different Timescales
One of the trickiest situations in machine learning is when you have to deal with datasets coming from different time scales.
196. From 2000 to 2018, Total Websites Grew From 17 Million to 1.6 Billion
Screenshot from Internet Live Stats. Interestingly, the internet experienced a year-over-year decline in 2010, 2013, 2015, and 2018. Wonder why? Here are more sites to use as sources for measuring aggregate website growth:
197. 8 Cloud Computing Trends to Watch in 2021
Cloud computing has grown exponentially in the past decade and is not about to stop. As predicted by Forrester’s research, the global public cloud infrastructure will grow 35% in 2021, many thanks to the pandemic. Due to the lingering effects of covid-19 in 2021, the cloud will be the key focus for organizations looking for increased scalability, business continuity, and cost-efficiency.
198. How Blockchain Can Democratize Data Collection And Why You Should Care
Blockchain technology democratizes data collection and gives individuals data sovereignty in the context of social media being able to collect so much of it.
199. Data Science Teams are Doing it Wrong: Putting Technology Ahead of People
Data Science and ML have become a competitive differentiator for organizations across industries. But a large number of ML models fail to go into production. Why?
200. Data Science With R Programming — Coding Interview Questions
R is a tool used for data management, storage, and analysis in the field of data science. It has applications in statistical analysis and modeling.
201. Understanding the Differences between Data Science and Data Engineering
A brief description of the difference between Data Science and Data Engineering.
202. WTF is Application Monitoring? Do you Need it?
Now that you've built the swanky software application of your dreams, what’s next?
203. How to Track Form Completions with Google Tag Manager
Setting up a website is relatively easy in 2020. Gone are the days when you had to code the whole thing on notepad and then connect to your host with some additional FTP software.
204. 7 Data Quality Metrics To Prioritize
Having high-quality data can make or break your projects in machine learning or business management. These 7 data quality metrics have the largest impact.
205. How to Use Appsmith, Airtable, and Notion to Build a Video Sorting Tool
According to Forbes, 82% of content generated this year is likely to be video.
206. What Will be the 3 Biggest Software Development Trends of 2022?
The number of software developers globally is due to almost double by 2030, yet InterSystems research has found that more than 8 out of 10 developers currently feel they work in a pressured environment. Creating a better experience for developers is key for inciting innovation, but the current data environment continues to evolve in ways that challenge the experience at every turn.
207. Lets Study the Seattle Airbnb Data
So, recently I started my Udacity Nanodegree on Data Scientist. To be honest, the first project speaks about CRISP-DM, which is the CRoss-Industry Standard Process for Data Mining. Let's leave it apart and start working on what we learn from the dataset.
208. So You Got Data... What Now?
Data, data everywhere…but not enough to decide!
209. How to Deal with Tech Trust Deficit
We’re more dependent on tech and e-commerce than ever before, and customers want to know that brands are protecting their data and privacy.
210. A Quick Guide To Business Data Analytics
For many businesses the lack of data isn’t an issue. Actually, it’s the contrary, there’s usually too much data accessible to make an obvious decision. With that much data to sort, you need additional information from your data.
211. Differences and Applications of Web Scraping and Data Mining
Learn the differences between web scraping and data mining and how to apply them.
212. 10 Common Coding Mistakes Data Scientists Should Watch Out For
A look at common mistakes that data scientists make in the process of service delivery.
213. Self-Sovereign Identity Based Access Controls or SSIBACs: An Overview
A recent academic paper uses Hyperledger infrastructure to conduct access control processes using decentralized identifiers, verifiable credentials, and conventional access control models.
214. Knowing Where and When to Enforce the Uniqueness of Your Data
This article looks at data uniqueness and discusses where it should be enforced. At application level or database level?
215. A Dash of Data, a Spoonful of Intuition
It's important to make informed decisions that positively impact your organization. But how do you know when to rely on data and when to use intuition?
216. 14 IoT Adoption Challenges That Enterprises Need To Overcome
Tech-enabled industries are never short of buzzwords, and the latest to join the bandwagon is the Internet of Things. Though this Industry 4.0 solution has been available for a long time, the need for adding smart technology has recently increased among industry leaders.
217. How to Back Up Exchange Online Data
218. A Javascript Queue Structure for Buffered Data
If you work with buffered data such as audio/video frame data, you have no doubt appreciated the features of Typed Arrays that came with ES2015 JavaScript. The ability to move, duplicate, and manipulate blocks of data using object methods is achieved by 'imposing' a dataview on the data blocks. These have made buffered data processing a breeze and fast (avoiding slow for-loops and extra code). A detailed discussion of typed arrays is found here: JavaScript typed arrays.
219. Opinion: There’s Nothing Wrong With Being Tracked by Google
Why you should be happy about companies collecting your data.
220. Good Data is in the Blood of Trusted Applications
Unfortunately, your app is really only as good as the data that supports it and most of the time, that can be out of your control.
221. On Dynamic Observability and Team Culture with Liran Haimovitch, Rookout CTO
Rookout Co-Founder and CTO, Liran Haimovitch, shares the origin story of their debugging tool, what excites him about the startup life, PLG, and more.
222. How Nutanix VM Works
In the era of enterprise cloud, a modern enterprise datacenter must support virtualization with high availability and live VM migration. Traditional storage area networks (SAN) and network attached storage (NAS) don’t suit this. Instead, they are ideal for managing a logical unit number (LUN). A LUN can be a single disk, an entire redundant array of independent disks (RAID), or disk partitions.
223. How Will Blockchain Fix the Centralization of Data?
“In order to have a standard of value [cryptocurrency] must stand outside all value schemes. It must have value in and of itself."
224. Data Is Now a Luxury Good: Here’s Why (It Shouldn’t Be)
When was the last time you read a privacy policy?
225. The Best 50 Sites to Learn About Data Science
Blogs, they’re everywhere. Blogs about travel, blogs about pets, blogs about blogs. And data science is no exception. Data science blogs are a dime a dozen and with so many, where do you start when you need to find the most valuable information for your needs?
226. Top 13 Data Visualization Tools for 2023 and Beyond
With the enormity of data, data visualization has become the most sought-after method to depict huge numbers in simpler versions of maps or graphs.
227. VMware vCenter Converter Alternatives for V2V?
VMware is commonly used to set up data centers for organisations. If the most commonly used solution doesn't work, some alternatives may work better for you.
228. What to Expect from AI in 2022
AI is too complex and dynamic a technology to be approached one-sidedly, only from the business or IT side. Read the article and find out more.
229. Use Up-Sampling and Weights to Address Imbalance Data Problem
Have you worked on a machine learning classification problem in the real world? If so, you probably have some experience with the imbalanced data problem. Imbalanced data means the classes we want to predict are disproportionate. Classes that make up a large proportion of the data are called majority classes. Those that make up a smaller portion are minority classes. For example, suppose we want to use machine learning models to capture credit card fraud, and fraudulent activity accounts for approximately 0.1% of millions of transactions. The majority of regular transactions will impede the machine learning algorithm from identifying patterns in the fraudulent activities.
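Both remedies named in the title can be sketched with the standard library alone. This is an illustrative sketch, not the article's code: the helper names `upsample_minority` and `balanced_class_weights` are hypothetical, and the weight formula is the common inverse-frequency heuristic, n_samples / (n_classes * class_count).

```python
import random
from collections import Counter


def upsample_minority(rows, labels, seed=0):
    """Resample each smaller class with replacement until it matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = rng.choices(pool, k=target - count)  # empty for the majority class
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def balanced_class_weights(labels):
    """Inverse-frequency weights: rarer classes get proportionally larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Toy data: 8 regular transactions (label 0) and 2 fraudulent ones (label 1).
labels = [0] * 8 + [1] * 2
rows = list(range(10))

new_rows, new_labels = upsample_minority(rows, labels)
print(Counter(new_labels))             # both classes now have 8 examples
print(balanced_class_weights(labels))  # weights for the original, unbalanced labels
```

Up-sampling changes the training set itself, while weights leave the data alone and scale each class's contribution to the loss. Either way, resampling should be applied to the training split only, so that duplicated minority rows do not leak into evaluation data.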
230. Creating a digital-first credit model designed for underbanked micro-businesses with Sean Salas
Camino Financial is an AI-powered Community Development Financial Institution (neo-CDFI) offering affordable credit to underbanked Latinx entrepreneurs.
231. Why Use Pandas? An Introductory Guide for Beginners
Pandas is a powerful and popular library for working with data in Python. It provides tools for handling and manipulating large and complex datasets.
232. How Pastel’s Cascade Stores NFT Data Securely
Cascade is a protocol that enables the storage of NFT data and metadata permanently within a highly redundant, distributed fashion with a single upfront fee.
233. How to Improve Data Quality in 2022
Poor quality data could bring everything you built down. Ensuring data quality is a challenging but necessary task. 100% may be too ambitious, but here's what y
234. Why Should Your Business Adopt Cloud-Based IT Solutions?
Data storage and access have long been a concern for businesses. A ton of data is created daily, and efficient storage is a must to keep track of everything.
235. 10 Best Hugging Face Datasets for Building NLP Models
Hugging Face offers solutions and tools for developers and researchers. This article looks at the Best Hugging Face Datasets for Building NLP Models.
236. Is There a 'GitHub For Data Scientists'?
What if I say that there is a place where you can not only store your Data Science projects but also experiment on them right then and there?
237. The Most Commonly Used SQL Queries by Data Scientists
SQL (Structured Query Language) is a programming tool or language that is widely used by data scientists and other professionals
238. These Companies Are Collecting Data From Your Car
Most drivers have no idea what data is being transmitted from their vehicles, let alone who exactly is collecting, analyzing, and sharing that data...
239. An Intro to AI Powered Product Development
The global product development services industry was close to $8 billion in 2020.
240. Using Automation in the Probate Process
Data-heavy manual processes like probate cases are common throughout the legal industry. Thankfully, automation offers a solution to this common problem.
241. MongoDB vs. DynamoDB: Choosing the Best Database for Your Business
All about MongoDB vs DynamoDB. Explore benefits, and in-depth comparison to find out the best choice for your business app.
242. Intro to Structured Query Language (SQL)
In the post, I used a simple SQL query to explain how certain things work in SQL. I also outlined problems with the query and potential ways to improve the code
243. Why Home Media Servers Are Worth Your Time
Files are getting larger and space for your favorite content can be at a premium. Getting your own server can make storing data so much easier.
244. Why You Should Scrape Data from Social Media Websites for Brand Audit
Social media scraping involves automating the process of extracting data from social media websites such as Twitter and Instagram through web scraping.
245. NAS Data Backup Is Essential For Remote Offices
Network Attached Storage (NAS) is a smart, dedicated data storage system that connects to storage drives, allowing multiple users to collaborate and share data.
246. 3 Ways You Can Build and Update Websites Using Data Pushes
Data is getting more and more accessible and is increasingly being used to inform the way businesses operate.
247. 4 Tips To Become A Successful Entry-Level Data Analyst
Companies across every industry rely on big data to make strategic decisions about their business, which is why data analyst roles are constantly in demand.
248. How to Design a Comprehensive Framework for Entity Resolution
In this blog, we will be looking specifically at the issue of resolving entities (also known as record linkage), as well as discussing a comprehensive framework
249. [Infographic] The State of Conversational AI in 2020
Conversational AI was always poised to take off in 2020. In fact, Gartner predicted that 80% of businesses would implement some sort of conversational interface by the end of this year. With the emergence of COVID-19 came compounded growth for the category - and I wanted to capture just how far we’ve come. So for the conversationally curious out there, I created this infographic that offers a clear depiction of where conversational AI stands at this very moment in time.
250. Election Subversion and Manipulation Is Not New
Technology has had a significant impact on how election campaigns are run, and it has been used in a variety of ways to influence election outcomes.
251. Using Data Attribution Comparison Table in Google Analytics 4
The configuration of Google Analytics 4 is not a walk for those unfamiliar with analytical tools or data setup.
252. How We Use dbt (Client) In Our Data Team
This is not really an article, but rather some notes about how we use dbt in our team.
253. Five Undervalued Data Points for Emerging Businesses
Apparently, data has become more ubiquitous than the stars in the sky. In fact, the amount of data produced daily via the Internet is set to top 44 zettabytes. As you might assume, that’s more data than you could possibly fathom or use.
254. Choosing A Colocation Data Centre That’s Right For You
Data and computer systems are at the heart of most companies, which is why it is paramount that where you store your IT infrastructure meets your needs.
255. What is a 'Data Fabric'?
A Data Fabric is a mix of architecture and technology that aims to ease the difficulty and complexity of managing several different data types.
256. Hacking Your Marketing Campaigns With Data Science
There is a ton of data points generated from each of your business activities today. A simple email blast to a few thousand recipients generates data pertaining to the open rates, click-through rates and conversion. These data points can further be distilled to infer specific information about the audience demographics that find your message appealing, the subject lines that trigger the user to open your emails, the CTAs that work, and so on.
257. How to Build a Web Scraper With Python [Step-by-Step Guide]
On my self-taught programming journey, my interests lie within machine learning (ML) and artificial intelligence (AI), and the language I’ve chosen to master is Python.
258. 4 Best Data Recovery Tools For SD cards, USB Drives, and Hard Drives
Oh no! I lost all my vacation pictures. What do I do now? Is it possible to recover all the deleted files from the SD card? Will I ever get to see my photos from the vacation again?
259. Protecting Yourself From CEO Fraud
Yesterday a friend of mine called to tell me that his CEO had emailed him asking for his personal financial details, with a CTA on a shortened URL. He was about to click the
260. What Can Recurrent Neural Networks in NLP Do?
Recurrent Neural Networks (RNN) have played a major role in sequence modeling in Natural Language Processing (NLP). Let’s look at the pros and cons of RNN
261. Data Virtualization: How It Works And What Benefits We Can Get From It
In the healthcare sector, data virtualization (DV) is gaining traction. It's still a hot subject, with many leading industry experts hailing it as a game-changer.
262. 🐱 How To Create Your Own Virtual World With CoderDojo☄️
This summer, we will be hosting a virtual CoderDojo meetup via Zoom. 🍿 Noblesville High Schooler Anna 👩🏻🦰 selected this lesson for her summer CoderDojo project.
263. The Operational Analytics Loop: From Raw Data to Models to Apps, and Back Again
Over the next decade or so, we’ll see an incredible transformation in how companies collect, process, transform and use data. Though it’s tired to trot out Marc Andreessen’s “software will eat the world” quote, I have always believed in the corollary: “Software practices will eat the business.” This is starting with data practices.
264. Interpretation of Visualizations of Soil Data and Weather APIs
Learn how to visualize and interpret weather APIs and soil data in different graphs using Python libraries and Google Colab.
265. A New Netflix Style Reality Show for People Who Love Data
Seven data professionals gear up to analyze and visualize one of the largest and most robust datasets out there to win the title - The Iron Analyst!
266. How to Become a Private Home Trader and 10 Tips to Help You Get There
Trading is a booming sector that today attracts many people. Here are 10 tips to help you succeed as a trader.
267. Data Organization – The Great Differentiator in the Digital Era
In business, efficient processes can make or break an organization. If processes are not executed properly, companies lose time and money and damage their reputation.
268. A Data-Centric Perspective of ChainLink’s Madness
The crypto market might seem incredibly boring to traders these days with Bitcoin and Ethereum behaving like stablecoins 😉. However, a handful of crypto-assets are showing an atypical momentum uncorrelated with the rest of the space. Among those crypto-assets, none is capturing the imagination of crypto-speculators like ChainLink. In the last few days, ChainLink has been regularly hitting all-time highs, defying the lack of momentum of the top crypto-assets.
269. Five Data Quality Tools You Should Know
Enterprises ensure their data is accurate, consistent, complete, and reliable by relying on data quality tools.
270. An Internal Email to Tim Cook and the State of Business Intelligence
We get a glimpse into the inner workings of a valuable company and it turns out it's not all sunshine and rainbows.
271. Busting Data Science Myths: "You Need a PhD, Extensive Python Skills, and Tons of Experience"
DJ Patil and Jeff Hammerbacher coined the title Data Scientist while working at LinkedIn and Facebook, respectively, to mean someone who “uses data to interact with the world, study it and try to come up with new things.”
272. 3 Reasons To Connect Data Silos To A CDP
In the current digital age in which we live, population data is of exceptional value. And so, when the data of a company's customers are in unconnected silos, they can be a barrier to success.
273. Are MySQL replications as smooth as you think they are?
What are you actually missing out on in MySQL replication? It appears easy, but debugging the problems it causes takes a lot of time. So, here's your answer.
274. Online Privacy is Not an Option: It's a Necessity
How the challenge of protecting personal information online led to data protection and privacy laws in the EU and U.S.
275. Data Engineering Tools for Geospatial Data
Location-based information makes the field of geospatial analytics so popular today. Collecting useful data requires some unique tools covered in this blog.
276. What Are The Challenges of Monetizing and Selling Data?
There have been great advancements in monetization opportunities in the last decade, but there are still challenges when it comes to generating big data analyti
277. Poor Data Quality is the Bane of Machine Learning Models
An examination of the importance of data quality, how it can present itself in a dataset, and how it can impact machine learning models.
278. How to Define Data Analytics Capabilities
Disclaimer: Many points made in this post have been derived from discussions with various parties, but do not represent any individuals or organisations.
279. Hospital Websites are Giving Facebook Sensitive Information
A tracking tool installed on many hospitals’ websites has been collecting patients’ sensitive health information—including details about their medical condition
280. Explore Different Ways You Can Use Data Visualization to Help Your Nonprofit
Data visualization can be a crucial tool for your nonprofit. Figure out when and how to use it to improve your organization.
281. Four Novel Machine Learning Methods for Analyzing Blockchain Datasets
Using machine learning to analyze blockchain datasets is a fascinating challenge. Beyond the incredible potential of uncovering unknown insights that help us understand the behavior of crypto-assets, blockchain datasets present unique challenges to a machine learning practitioner. Many of these challenges translate into major roadblocks for most traditional machine learning techniques. However, the rapid evolution of machine intelligence technologies has enabled the creation of novel machine learning methods that are highly applicable to the analysis of blockchain datasets. At IntoTheBlock, we regularly experiment with these new methods to improve the efficiency of our market intelligence signals. Today, I would like to provide a brief overview of some novel ideas in the machine learning space that can yield interesting results in the analysis of blockchain data.
282. Reasons Why Data Privacy Matters
Data privacy is one of the hottest topics in tech conversation. But what's the deal with it? Is it good? Is it bad? Keep reading to find out.
283. Open-Source Intelligence (OSINT) Use by Governments
In the 1980s, the US military first coined the term ‘OSINT’. Since then, the dynamic reform of intelligence has been beneficial in many different scenarios.
284. Leveraging Data Analytics to Improve Patient Adherence
Role of pharma analytics to enumerate the factors accountable for falling medication adherence and the increasing role of data analytics and machine learnin
285. 10 Reasons to Get Your Cybersecurity Certification
Cybersecurity certifications can give you the skills employers most often expect and prepare you for the diversity needed in sophisticated areas of cybercrime. So, here are the top compelling reasons to pursue additional cybersecurity credentials.
286. What Should I Do After the Data Observability Tool Alerts Me
We need to start building the best practices across the ecosystem to maximize the value of data observability.
287. Hospitals Remove Facebook Tracker but Questions Still Remain
Meanwhile, developments in another legal case suggest Meta may have a hard time providing the Senate committee with a complete account of the health data.
288. Handling Data Integrity Issues Like a Pro
What do you do if an API you reference sends 200 - OK but an error message? What do you do when a critical column is missing from your Excel upload? Read me.
289. Debezium Introduction: Another Change Data Capture Tool
Building an enterprise data warehouse can be either relatively straightforward or very sophisticated. It depends on many factors, such as the complexity of the conceptual data model and the variety of source systems. In many cases, applying the Change Data Capture (CDC) approach can make data integration simpler. Fortunately, there are plenty of CDC tools available on the market, many of which are easy to use and affordable, while others are cumbersome and expensive (for what they are).
290. How to Use Tableau Visualization to Make a Covid Risk Model
In this paper, I used data from two different data sources and merged them together in the Tableau layer to perform the data analysis.
291. How a Data Scientist Sees a Deck of Cards
The Data Scientist Creativity Paradox
292. 'Experience is a Double-edged Sword': Kyle Kirwan, CEO of Bigeye
An interview with the founder and CEO of Bigeye, a data observability platform.
293. Top 7 JavaScript Pivot Widgets in 2022
Pivot Charts are useful tools that can be relied on to visualise huge amounts of data. These 7 JavaScript Pivot Widgets are some of the best ways to use them.
294. 3 New Startups That Are Innovating DeFi Data Analysis Technology
Data analysis as a whole is one of the most important industries. Now that DeFi is a full-fledged industry, there is a growing need for valuable data analytics.
295. GraphQL, GraphQuill, and You
Let’s start with the idea of a database, and a basic query. Server taps the database. Server brings back persistent state information that allows an application to update, and maybe a GUI. What’s wrong with this picture? Not much at first, of course. A GET request is the anchor of RESTful architecture, and in some ways the anchor of the web. So basic that fetch syntax defaults to that type.
296. How to Create a Data Analytics Strategy to Grow Your Business
Are you building a Software-as-a-Service platform? Wondering what data is essential for your business? Time for a Data Analytics Strategy.
297. Practical Tips to Improve Customer Experience with Data
According to a report, almost 70% of companies compete on customer experience.
298. Save API Costs With Data-Centric Security
APIs are quickly becoming the front door to modern enterprises. But the API paradigm also comes with various hidden costs around development, management, etc.
299. Why Linux-Based Brands Are So Desirable
I'll start off by dating myself... it was the year 2000. I was in college and the brand new Mini Disk MP3 player had just come out. It offered superior audio to CDs and the ability to hold hundreds of songs on 1 little disk. Being a broke college kid, it took me about 6 months to make the purchase. Just when I got used to looking cool with my MD player, a wild flash of cool came across the analog airways via a commercial from a company that was only recently regaining its cool with a crappy multicolor desktop PC called the iMac. Of course, I'm talking about Apple. The product was the iPod. I was defeated and nearly threw away my MD player on the spot.
300. Moving From the Flat Earth: Why We Should Switch to Data-Driven Finance
Businesses should switch from linear formulae to data-driven finance. This will allow companies to get more than just an immediate revenue boost!
301. How Do You Hack Data Structures and Algorithms? Teach Us Sensei!
Software Engineers are always on the lookout for better, more efficient ways to solve problems.
302. Things to Consider When Looking For Data Science Roles
There is great demand for data scientists, which creates market dynamics that are favourable for the community. More so than your peers in other professions, you will be able to evaluate a company for what it is able to offer you, rather than solely being the one that is being evaluated. So what should you look for when comparing and evaluating data science roles? Here is a list of some commonly known factors plus some less discussed ones that will help you in your evaluation.
303. How To Use Change Data Capture for Fraud Detection
Still relying on overnight processes to drive your decision making? Maybe it’s time to consider an evaluation of your CDC pattern that uses new technology.
304. Common RAID Failure Scenarios And How to Deal with Them
Most businesses these days use RAID systems to gain improved performance and security. Redundant Array of Independent Disks (RAID) systems are a configuration of multiple disk drives that can improve storage and computing capabilities. This system comprises multiple hard disks that are connected to a single logical unit to provide more functions. As one single operating system, RAID architecture (RAID level 0, 1, 5, 6, etc.) distributes data over all disks.
305. The Burgeoning Global Surveillance State - What's Going On?
What is a surveillance state? Privacy International defines it as one which “collects information on everyone without regard to innocence or guilt” and “deputizes the private sector by compelling access to their data”.
306. How Is Data Automation Transforming The Workplace?
Every minute, a staggering 1,820 terabytes of data is created around the world. That’s more than 2.5 quintillion bytes every day!
307. Create A Data Visualization Map Using Mapbox
In this article, we make a map with software called Mapbox in a few simple steps. This won't involve any coding at all!
308. ANSI X12 EDI Basics: A Guide to the ANSI X12 Standards
ANSI X12 EDI is one of the most important concepts that you must be aware of prior to implementing EDI in your organization.
309. Leveraging Data Science in eCommerce: 7 Projects to Try
As an online retailer, how can you improve your business? By providing a better customer experience, of course. An e-commerce company needs to have a good understanding of the following factors:
310. 4 Data Transformations Made Spreadsheet-Easy
Gigasheet combines the ease of a spreadsheet, the power of a database, and the scale of the cloud.
311. 6 Keys to SaaS Security Posture Management
You're not doing everything you can to protect your SaaS environment if you're skipping one of these: 1. Security policy enforcement, 2. Regular configuration..
312. Live on the Edge of Computing
Data is the lifeblood of any application and any business venture.
313. 5 Ways to Store Market Data: CSV, SQLite, Postgres, Mongo, Arctic
What's the most efficient way to store market data? SQL or NoSQL? Let's compare the 5 most common options and find out which is best.
314. What is a Minidump?
Adding minidump support came with a number of technical challenges that we had to address.
315. The Importance Of On-chain Analysis
A look at the importance of on-chain analysis.
316. 7 Q&As About Memory Leaks
317. The Future of the Internet Through the Web 3.0 Lens
Jules Verne, John Brunner, Arthur Clarke, William Gibson, George Orwell — it’s a short list of writers who predicted the future in their books. They’ve written about social and technical changes that will take place in human society. Here we are, facing those changes good or bad.
318. Sberbank-Owned RuTarget Harvested User Data for Months via Google
Google may have provided Sberbank-owned RuTarget with unique mobile phone IDs, IP addresses, location information and details about users’ interests and online.
319. How to Clean and Verify Address Data 'Without Using Code'
Today, data verification has become one of the greatest assets of an organization.
320. 4 Ways Cities Are Utilizing Data for Public Safety
Cities have been using data for public safety for years. What new technology is emerging in public safety, and how does it affect you?
321. Improving Customer Experience Through Personalization With Predictive Analytics
The development of smartphone and computer technologies, and the internet in general, have influenced customers’ default behavior and expectations.
322. Football Data Analysis Using Machine Learning Models Can Potentially Boost Throw-Ins!
Can machine learning models help improve ball accuracy, precision and retention, leading to scoring after throw-ins?
323. How to Make Rough Estimates of SQL Queries
To estimate SQL queries, we need to understand how a database works with them. Let's find out what exactly the database does with queries.
324. Why FHIR Capabilities of Healthcare Data Platform is Critical to Quality and Cost of Care Delivery
The flexibility of interoperability in the healthcare system has enhanced patient-doctor interaction to a great extent.
325. Not So Fast: Valuable Lessons from the FastCompany Hack
When FastCompany's website was hacked recently, it sent shockwaves through the media world, underscoring the importance of routine cybersecurity hygiene.
326. The Power State of Dark Data.
Have you ever heard of “Dark Data”?
327. Using Data Science To Deal With RTOs
Considering how much fraudulent RTOs can cost a business, using data science to mitigate their frequency can help save an e-commerce business money over time.
328. How to Combat Reader Fatigue to your Content Marketing Campaigns
Does your content generate more yawns than leads? Is your content just another copy of 100 similar articles clogging up the search engines? Hopefully not, but even if you think your content has an impact, there is always room for improvement.
329. Data Analysts As Arbiters of Truth
Coming out of college with a background in mathematics, I fell upward into the rapidly growing field of data analytics. It wasn't until years later that I realized the incredible power that comes with the position.
330. How Government Agencies Flex Their Data Science Muscle
From NASA to the NSA, data science is being employed by the governments of every major country to inform policy, provide public services and, in some cases, surveil ordinary people. In the United States in particular, it underpins many of the public sector’s most important functions, whether we citizens are aware of it or not.
331. CivicGraph: An Open Source Versioning Data Store for Time Variant Graph Data
I would like to introduce an open source, Apache 2.0 licensed project of mine: https://github.com/CivicGraph/CivicGraph
332. Microsoft's Power BI for Business: Features, User Experience And Pricing
All you need to know about Power BI features, benefits, use cases, and pricing. A comprehensive guide to Power BI and how it stacks up to Microsoft Excel.
333. 3 Types of Anomalies in Anomaly Detection
An Introduction to Anomaly Detection and Its Importance in Machine Learning
334. These Shifts Will Shape The Future Of Data Centers
According to Gartner, spending on data centre infrastructure is expected to grow 6% in 2021 after a steep decline of 10.3% in 2020. The reduced demand for data infrastructure is expected to come back in 2021 once the workforce gets back to the site, according to Naveen Mishra, a senior research director at Gartner.
335. How to Connect to Salesforce Data in AWS Glue Jobs Using JDBC
Connect to Salesforce from AWS Glue jobs using the CData JDBC Driver hosted in Amazon S3.
336. 10 Best Tactics For Your WooCommerce Store Security
WooCommerce is a great plugin for WordPress to build an online store. With an entire eCommerce ecosystem and a dedicated global community, it has achieved the reputation of an industry standard. Still, this doesn’t mean that nothing can go wrong, especially if you ignore essential security precautions. Here are ten tips on how to make your business (and your customers’ data) safe.
337. Why I Spent Years Writing a Children’s Book on Data Science
I wrote a children's book on data science to inform others who have a hard time understanding data science and machine learning concepts, especially kids!
338. How Big Tech Influences Privacy Laws
The Markup reviewed public hearing testimony in all 31 states that have considered consumer data privacy legislation since 2021 and found a campaign by Big Tech
339. What is Data Collection and What are The Most Important Events to Track
When your company is client-oriented, one of your priority tasks is understanding your clients’ problems and gathering insights on how people use your product and when exactly they benefit from it.
340. This Online Abortion Pill Provider Used Tracking Tools That Gave Powerful Companies Your Data
The trackers notified Google, Facebook’s parent company Meta, payments processor Stripe, and four analytics firms when users visited its site.
341. Creating a Dependable Data Pipeline for Your Small Business
In this article, I will be showing you how to build a reliable data pipeline for your small business to improve your productivity and data security.
342. Apache Airflow: Is It a Good Tool for Data Quality Checks?
Learn the impact of Airflow on data quality checks and why you should look for an alternative tool.
343. AI-enabled Smart Cities: What to Get Right
Data is the foundation of smart cities. However, to deliver the right solutions, planners must establish sustainable data and AI technology policies.
344. Tired of Dirty Data? It’s Time to Implement a Data Scrubbing Initiative
Raw data coming in from various sources is often inherently dirty data, rife with factual errors, typos and inaccuracies. Left unattended, this data becomes a nightmare. Imagine having to pull a report only to realize it has duplicated data – not to mention half of them don’t even have valid phone numbers or addresses. Your boss is not going to be happy.
345. Ultimate Guide to React Data Grid And its Mind-blowing Features
Indisputably, React is always the first choice for front-end web developers, and the data grid has likewise been a priority among visual software elements since the advent of UIs themselves.
346. An Intro to SQL for Data Scientists
The importance of SQL and how to go about learning it
347. Designing a Website for Data
It’s complex to create the right design when the only visuals you have are based on data. Here’s how we did it.
348. Data Engineering Hack: Using CDPs for Simplified Data Collection
From simplifying data collection to enabling data-driven feature development, Customer Data Platforms (CDPs) have far-reaching value for engineers.
349. Why RPA Is Attractive to Manufacturers
As Kevin Ross from CyberArk puts it, “RPA is one of the hottest technologies in the IT market today, mainly due to its potential to deliver huge benefits to companies.” Many businesses are rushing to close the gap between production and demand for their services, especially after the COVID-19 disruptions.
350. ModelOps Introduction Series: Data Preparation In AI
No one should be doing AI for the sake of doing AI—it must be tied to a clear business objective. All AI applications stem from mining data, whatever the aim.
351. The Hyper-V Admins' Guide to VMware Backup
The Hyper-V and VMware virtual environments may seem similar, but closer inspection reveals a number of important differences between these two platforms.
352. Attorney Adam Williams Gives Advice on Data Breaches and Class-Action Suits
Technology is changing incredibly quickly, and as a consumer, it can often be difficult to stay on the front lines of keeping yourself secure while using all of the technological marvels that the world has to offer. Data breaches and privacy breaches are reported every day in the news, making the private information of thousands vulnerable to identity theft and the consequences that come with it.
353. The Usefulness Of Data Science In Law Enforcement
Law enforcement agencies are not new to data and its usage, but with advances in technology, data science in law enforcement has become a necessity.
354. What Working on an Analytics Product Can Teach Us About Data
The ubiquity of analytics hides potential complexity underneath, especially when you start to consider products where the analytics are more front and centre.
355. What is a Data Reliability Engineer?
With each day, enterprises increasingly rely on data to make decisions.
356. Data Sovereignty: The Importance of Keeping Your Data Safe
Protect your personal data with data sovereignty. Learn the importance of keeping your information safe and secure in the digital age. Read our article to find
357. Open Source is Much More Than Just a Free Tier - Here's Why
This article explores the reasons why open source has been so successful, the areas where it has not, and the differences with free tiers of software.
358. Nginx Logs - Fair Database Benchmarks
How one test works to analyse millions of Nginx logs from a live website and what to learn from the analysis results while processing it in a timely way.
359. A Look at the Trends in Developer Jobs: A Meta Analysis of Stack Overflow Surveys
I'm really interested in the trends we see in the software engineering job market.
360. How to Build Cloud-Based Data Architectures
Building cloud-based data architectures is necessary for making use of Big Data and gaining the ability to process significant amounts of data for analysis.
361. Data Teams Need Better KPIs. Here's How.
Here are six important steps for setting goals for data teams.
362. How Bad Data Will Ruin Your Account-Based Marketing
Making account-based marketing decisions and executing ABM campaigns based on inaccurate data can lead to wasted time and resources. Here's how to avoid that.
363. The Noonification: How Often Do NFTs Pass The Howey Test? (1/13/2023)
1/13/2023: Top 5 stories on the Hackernoon homepage!
364. How Big Data is Keeping Employees Engaged in the Age of WFH
Big data is beginning to emerge as a key tool for businesses to successfully operate on a WFH basis.
365. The Benefits And Core Processes of Data Wrangling
This article examines the process and methods of data wrangling: preparing data for further analysis by transforming, cleaning, and organizing it.
366. Grafana Loki: Architecture Summary and Running in Kubernetes
Grafana Loki logging system architecture and components, its setup in Kubernetes from the Helm chart with AWS S3 as Single Store and boltdb-shipper for indexes.
367. How Tencent uses Prometheus and Grafana to Set Up a Monitoring System in 10 Minutes
This blog will introduce how Tencent uses Prometheus and Grafana to set up a monitoring system for its data platform.
368. Investors Clamor for Digestible Data Analytics in the Fledgling Crypto Industry
As DeFi data generation grows with the industry, there is an increased need for platforms that are able to digest and analyze this data for investors.
369. The Healthcare Revolution: How The Metaverse Can Transform Traditional Industries And Improve Them
How new technology from the Metaverse and Web3 can help improve the healthcare industry by improving training, making better tools and making processes better.
370. Big Data's Influence on Decision Making in the Healthcare Industry
Big data is transforming decision-making in healthcare and this article explores how it can be used to improve patient care, as well as its challenges.
371. Who is Life360 Selling Your Data to?
Life360, a safety app, sold data to Cuebiq and X-Mode, two location-data brokers.
372. How to Build a Versatile Traverse Function from Scratch
Learn how to create your own traverse function in under 5 minutes.
373. How to Keep Mission-Critical Business Data Secure in the Mobile Age
Andrew Nichols | Protecting Mission-Critical Business Data in the Mobile Age
374. Everything You Need to Know About Deep Data Observability
What Deep Data Observability is and how it differs from shallow data observability.
375. Your Children's Data Is Being Collected by Educational Technology Companies
Vista Equity Partners is collecting data on children that go to school.
376. Improve Your AI Training Data Using Self-Agreement Protocols
Finding, creating, and annotating training data is one of the most intricate and painstaking tasks in machine learning (ML) model development. Many crowdsourced data annotation solutions often employ inter-annotator agreement checks to make sure their labeling team understands the labeling tasks well and is performing up to the client’s standards. However, some studies have shown that self-agreement checks are as important or even more important than inter-annotator agreement when evaluating your annotation team for quality.
377. API Integration in 2021 and a Look at Insights from 2020
What's next for API integration? Share your thoughts in Cloud Elements' industry survey to win prizes and contribute to the industry-renowned report.
378. Everipedia: Online Encylopedia With Over 6 Million Articles
My name is Matthew E. O'Neil. I have made over 9,000 edits for Everipedia, which makes me the #1 ranked editor of all time for the company. I graduated with a B.S. from Colorado Christian University in 2001 and an M.B.A. from Regis University in 2004. Both universities are located in Colorado. As of May 2020, I am ranked #1 all time in the number of edits at Everipedia under the user name 'ore1pnq21bfu'. I am in contact with boxing promoters Top Rank, Don King Productions and Mayweather Promotions. I have been in contact with TMZ and Mick Magsino of Reuters. I competed for Clark County Commission in 2006 against Rory Reid and lost. Reid was later accused of campaign finance infractions but was never formally charged. I was interviewed by Tony Cooke of the Las Vegas Sun when I ran for Clark County Commission.
379. 4 iPaaS Use Cases for 2023
iPaaS products and providers can help integrate data and applications between the cloud and businesses. Here are some compelling ways to use iPaaS solutions th
380. Supporting 'Citizen IT': It’s Critical to Democratize Your Data
Democratizing data to enable Citizen IT provides a competitive advantage to organizations - here's why.
381. People Intelligence Platforms: Everything You Ever Wanted To Know
You may have heard of People Intelligence platforms, though the solutions you may have encountered likely focused on the pre-hire stages of the employee experience rather than the day-to-day post-hire segment. People Intelligence can be defined as the combination of strategies and technologies used by organizations for statistical and data-driven analysis of performance, productivity, and business information. People Intelligence platforms can make sense of disparate people-generated data by transforming the information into actionable insights that impact a business’ operations.
382. How to Train Computer Vision Models Efficiently
The starting point of building a successful computer vision application is the model. Computer vision model training can be time-consuming and challenging if one doesn’t have a background in data science. Nonetheless, it is a requirement for customized applications.
383. Why Normalizing Your Database Is Just Like Organizing Your Closet
Explore database normalization in an easy-to-understand way using pet store examples. Discover the different levels from 1NF to 3NF and the importance of data
384. Hacking the Path to Expanded Credit Access
A look at how previously unused alternative data may improve the ability for millions of Americans to get a credit score.
385. An Intro to Analytics Tools for Video Streaming Business
After you have invested your resources and time in making a product, you naturally like to know what people think of it. You would also like to be wise enough to collect all the information you can about who is using it and how they are using it so you can make the product a lot better.
386. European User Data is Shared 376 Times Per Day on Average
Violation of private data and its commercial exchange are recurrent issues in the online world. In this thread, our community discusses personal data sharing.
387. The Simple Way To Empower Data Science Teams With Data
How can you empower your data scientists to help your product strategy teams? Give them a lot of high-quality product data.
388. The Best Options to Store Data and Keep it Safe Forever
As technology has evolved, the options for storing large amounts of data have also changed. In this article, we discuss the terms archiving and backups.
389. About the Wright Brothers Journey to Accurate Wind Tunnel Data
I originally published this story for the Atlan Humans of Data publication.
390. Optimize Your Data Engine With Data as a Service (DaaS) and Multi-Tenancy
Data-driven organizations are planning to build a data as a service (DaaS) architecture to make it easier to onboard their users, partners, and consumers.
391. Scaling Off-Chain Data and Computation for Smart Contracts
As storing information on the blockchain becomes more popular, the availability of smart contracts becomes more widespread. They behave according to established parameters, automatically letting events happen once specified conditions are met.
392. Using Real-Time Data in Digital Marketing
Learn how you can use real-time data in digital marketing for customer engagement and retention, and analyze it for faster decision-making.
393. Building the Next-Generation Data Lakehouse: 10X Performance
How to connect various data sources easily and ensure high query performance.
394. Your Planned Parenthood Data Is Being Collected, Analyzed, and Sold
The Markup has found that location data company INRIX, which collects and sells aggregated vehicle, traffic, and parking data, includes Planned Parenthood.
395. Estimating Price Elasticity with Machine Learning
Using machine learning, multi-linear regression, and scikit-learn to estimate price elasticity for wine products.
396. Why Did Twilio Acquire Segment for $3.2 Billion? To Better Understand End User Data
This week the API-based communications platform giant Twilio formally announced that it would acquire the customer data platform startup Segment for $3.2 billion in an all-stock deal. The market responded quite positively to the news, driving Twilio’s stock price up by 7.7% and bringing Twilio’s market cap to a staggering $49 billion.
397. Bypassing Enterprise Data Encryption Policy with Metadata [A How-To Guide]
A few companies I've worked for have an IT policy on their secure computers designed to stop movement of sensitive data outside the enterprise. This policy encrypts all file data being written to removable media (USB drives, external hard drives, etc.) such that only a computer within the same enterprise can decrypt and read the data.
398. How to Achieve Optimal Business Results with Public Web Data
Public web data unlocks many opportunities for businesses that can harness it. Here’s how to prepare for working with this type of data.
399. Big Data Analysis for the Clueless and the Curious
Big data analytics has been a hot topic for quite some time now. But what exactly is it? Find out here.
400. Illusion of Choice: You Aren't Deciding How Important Your Privacy Is
The above statement is easily the most eloquent justification of privacy that I've seen. Thanks in large part to Snowden, the past decade has seen large parts of society become serious about data privacy, but it still feels like an overwhelming number of people can't be bothered to give this issue even a second of their time.
401. $15M Lost due to low Quality Data
Many organizations today are plagued by poor data quality management. Don't make the same mistakes.
402. Introducing RecallGraph (formerly CivicGraph)
I had posted earlier about an open source temporal graph database that I have built, named CivicGraph.
403. How to Grow your Video Business with Data
TV watching used to be a family affair a decade ago, but today in most households, content watching has become a personal activity.
404. Hacking Your Analytics: Top Barriers in Harnessing the Power of Data
An infographic on how to use more of your organization's data with Google Analytics 360 to proactively form solid, data-based business decisions.
405. 5 Things to Watch Out for When Implementing Tableau BI
Has your organization decided to adopt and implement the Tableau BI platform, namely its Tableau Server and Tableau Online versions?
406. Our Data-Driven Approach to Making Sense of the 2020 Presidential Election
In less than five months, the world’s attention will be drawn to the outcome of the US Presidential election.
407. 6 Tips for Working With Analysts and Data Engineers
What work does a data engineer actually do? Let me tell you one thing: it’s not what you think they should be doing, especially not the part where they are running around collecting data for you or building yet another one of those dashboards that will only be used for a few weeks.
408. An Intro to Real-Time Data Anchoring and Legacy ERP Systems
Comprehensive data visibility is still a big challenge in enterprise resource planning.
409. Analyzing 110 Million Comments from Hacker News
In this article, we’ll observe another test with 1.1M curated Hacker News comments with numeric fields.
410. We Discovered Disparities in Internet Deals: Here's How We Did It
Yet the high-speed internet options offered by an internet service provider can vary even by neighborhood within a city...
411. Context Matters in Semantic Ambiguity
If we assume all items in the list above have the same semantic value, what is it exactly?
412. How to Fetch Data from APIs Using useEffect React Hook
In this article, we will take a look at useEffect React hook to fetch data from an API. We will create a sample React application to pull data from the provider and use it in our application.
413. 4 Data Analytics Certifications That Boost Your Career
These are the best data analytics certifications to give you the right kind of guidance, boost your big data analytics career, and help you get a great job.
414. "Will the Stock Market Reset After the Election?" asks Frederik Bussler
Frederik Bussler, on a mission to democratize data science, has contributed an impressive 27 days, 7 hours, and 41 minutes of reading time to Hacker Noon, with stories on everything from AI and no code to strategic thinking and financial mythbusting. Scroll down to learn more about this prolific contributor.
415. Fetch.ai Releases DabbaFlow: Encrypted File Sharing Platform for Secure Data Transfers
DabbaFlow, an end-to-end encrypted file-sharing platform developed by Fetch.ai, a Cambridge-based artificial intelligence lab, was launched recently.
416. 8 Crucial Tips for Hardening PostgreSQL 14.4 servers in 2022
As of July 13th, 2022, there are 135 security flaws reported to the CVE database. Here are 8 essential measures you can take to protect your PostgreSQL server.
417. Why Governments Can’t Stay Away from Blockchain
“The blockchain cannot be described just as a revolution. It is a tsunami-like phenomenon, slowly advancing and gradually enveloping everything along its way by the force of its progression.” In these words, William Mougayar, one of the greatest proponents of blockchain, praises the colossal impact of the decentralized ledger on everything.
418. COVID-19: Perceived Spread vs. True Spread in China, Italy and the US
Here at TimeNet, we’re building a large time series database with the primary aim of benefitting society through access to data. In this post we’ll study different time series representing both the true, and the perceived spread of the coronavirus (COVID-19) pandemic. Daily COVID-19 numbers are currently available on TimeNet.cloud for many countries. We’re expanding these datasets with further variables measuring how we (people) perceive the significance of the pandemic. We use stock market movements and internet search trends to quantify the virus’s perceived spread.
419. Please Dont Build Your Data Pipeline using Singer
Singer.io is an open-source CLI tool that makes it easy to pipe data from one tool to another. At Airbyte, we spent time determining if we could leverage Singer to programmatically send data from any of their supported data sources (taps) to any of their supported data destinations (targets).
420. Exploring Ethical User Data with Datawallet’s New SDK
On July 8, the Information Commissioner’s Office (ICO) announced the highest GDPR fine ever of £183 million over last year’s data breach at British Airways. The UK’s data watchdog elected to fine the airline as its “poor security arrangements” led to the breach of credit card information, names, addresses, travel booking details, and logins of around 500,000 customers. In recent years, consumers have become wearily accustomed to data breaches of this magnitude.
421. The Data Revolution: Green, Accessible, and Fully-Decentralized
It's time to make data sexy again. Let's revive the dream of a truly neutral Internet, from the people, for the people, with wide accessibility and no censorship.
422. What You Need to Know About Python’s Data Model
A Concise Overview of Data Model, Special Methods and the Collection API in Python.
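As a minimal sketch of what special methods buy you, the hypothetical `Playlist` class below (the name and sample data are illustrative, not taken from the article) implements only `__len__` and `__getitem__`, and Python's sequence protocol supplies iteration, membership tests, and `reversed()` on top of them.

```python
class Playlist:
    """A tiny sequence-like container built on special methods (illustrative only)."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    def __len__(self):
        # Called by len(playlist)
        return len(self._tracks)

    def __getitem__(self, index):
        # Called by playlist[index]; also enables iteration, `in`, and reversed()
        return self._tracks[index]

    def __repr__(self):
        return f"Playlist({self._tracks!r})"


playlist = Playlist(["intro", "verse", "outro"])
print(len(playlist))             # 3
print(playlist[0])               # intro
print("verse" in playlist)       # True
print(list(reversed(playlist)))  # ['outro', 'verse', 'intro']
```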
423. How Different Industries Put Data Analytics to Use
You must have heard about big data and the theory used behind it. However, are you aware of the top industries where data analytics is being used for changing the way we work in the actual world? Let's take a close look at the top big data industries and how they are getting reshaped by using data analytics. The main idea behind using big data is that it is a new method for gaining insight into the challenges faced by various companies each day. In earlier days it was not possible to collect and interpret a vast quantity of data because there was no technology available.
424. How Different Analyst Types Can Positively Impact Your Small Business
Data analysis used to be considered a luxury of big business.
425. The Mass Storing of Data Can Turn The Consumer into The New Farmer
Google’s cloud is an interesting example of how information flips the supply chain upside down.
426. Integrate AI into Data Mapping to Drive Business Decision Making
Prior to analyzing large chunks of data, enterprises must homogenize them in a way that makes them available and accessible to decision-makers. Presently, data comes from many sources, and every particular source can define similar data points in different ways. For example, the state field in a source system may show “Illinois” while the destination keeps it as “IL”.
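To make the “Illinois” versus “IL” mismatch concrete, here is a small sketch of the rule-based baseline that AI-assisted mapping aims to improve on. It is a plain lookup table, not an AI technique, and the `STATE_ABBREVIATIONS` table and `normalize_state` helper are hypothetical names chosen for illustration.

```python
# Hypothetical, deliberately incomplete lookup table for illustration.
STATE_ABBREVIATIONS = {
    "illinois": "IL",
    "california": "CA",
    "new york": "NY",
}


def normalize_state(value):
    """Map a source-system state value onto the destination's two-letter code."""
    cleaned = value.strip()
    # Already a known abbreviation (in any letter case): keep it, upper-cased.
    if cleaned.upper() in STATE_ABBREVIATIONS.values():
        return cleaned.upper()
    # Known full name: translate it; unknown values pass through unchanged.
    return STATE_ABBREVIATIONS.get(cleaned.lower(), cleaned)


print(normalize_state("Illinois"))  # IL
print(normalize_state(" il "))      # IL
print(normalize_state("Ontario"))   # Ontario
```

Passing unknown values through unchanged keeps the sketch lossless; a real pipeline would more likely flag them for review.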
427. A Look at COVID’s Impact on Data Privacy and Protection
After more than a year into the pandemic, it’s clear that COVID-19 will have lasting impacts. As companies rapidly embraced digital transformation, data privacy and protection have seen some of the most significant changes. COVID data risks and policies will likely far outlast the virus itself.
428. Introduction to Web Scraping: Parsing Craiglist with Java
Web scraping or crawling is the practice of fetching data from a third-party website by downloading and parsing the HTML code to extract the data you want.
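The article itself works in Java; the sketch below shows only the parsing half of the idea in Python, using the standard library's `html.parser`. The `LinkExtractor` class and the `SAMPLE_HTML` snippet are made up for illustration, and the download step is skipped so the example runs offline.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (link text, href) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only record text while inside an anchor that has an href
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            title = "".join(self._text_parts).strip()
            self.links.append((title, self._current_href))
            self._current_href = None


# Made-up markup standing in for a downloaded listings page.
SAMPLE_HTML = """
<ul>
  <li><a href="/listing/1">Used bicycle</a></li>
  <li><a href="/listing/2">Desk lamp</a></li>
</ul>
"""

parser = LinkExtractor()
parser.feed(SAMPLE_HTML)
print(parser.links)  # [('Used bicycle', '/listing/1'), ('Desk lamp', '/listing/2')]
```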
429. Lying to the Blockchain: Applying The Garbage In, Garbage Out Problem to Decentralized Networks
In this article, we address a notion that is often overlooked (mostly, intentionally) of how real-world data interacts with blockchains.
430. Itheum: The Convergence Of Data, Web2, And The Metaverse
Exclusive interview with Mark Paul, founder of Itheum, a project making game-changing advancements towards data privacy by integrating it with the metaverse
431. Las 15 preguntas más frecuentes sobre Web Scraping
Previously published at https://www.octoparse.es/blog/15-preguntas-frecuentes-sobre-web-scraping
432. MODEL-CENTRIC vs DATA-CENTRIC Approaches in Machine Learning
Machine learning is an area of artificial intelligence (AI) and computer science that focuses on using data and algorithms to mimic the way humans learn
433. Overfitting in Financial Model Building
Creating a powerful predictive algorithm usually involves a certain amount of hyperparameter optimization. This involves tuning a model’s parameters to maximize a certain objective function, such as the Sharpe Ratio in finance. One of the most popular methods is Bayesian optimization, which is a significant improvement in computational efficiency and results over both random search and grid search — two other popular ways of optimizing hyperparameters. When evaluating a costly black-box function, Bayesian optimization is by far the most popular method for tuning hyperparameters.
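For readers unfamiliar with the objective mentioned above, here is a rough sketch of a Sharpe ratio calculation: mean excess return divided by the standard deviation of excess returns. The `sharpe_ratio` function, its default risk-free rate of zero, and the sample returns are assumptions for illustration; annualization is left out.

```python
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample standard deviation."""
    excess = [r - risk_free_rate for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


# Made-up per-period returns, not real market data.
sample_returns = [0.01, 0.02, 0.03]
print(round(sharpe_ratio(sample_returns), 6))  # 2.0
```

A hyperparameter search, whether grid, random, or Bayesian, would call a function like this on backtest results for each candidate configuration, which is exactly where overfitting to the historical sample can creep in.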
434. Solving Data Integration: The Pros and Cons of Open Source and Commercial Software
There was an awesome debate on DBT’s Slack last week discussing mainly two things:
435. Ensuring Privacy with Zero-party Data
Zero-party data is the future of data collection because it bridges the gap between advertising needs and consumers’ concerns about privacy.
436. Using Data Analytics Effectively in Marketing
How to make your data work harder for you in marketing
437. HIPAA Does Not Prevent Period-Tracking Apps From Selling Location Data
The Health Insurance Portability and Accountability Act, the federal patient privacy law known as HIPAA, does not apply to most apps that track menstrual cycles
438. The Missing Link: Why Are Linked Lists Useful in Software?
I like to start off Metaphysically then move down into the specifics of something. So then why are Linked Lists useful in software? Well I’ll answer with the question “Why have belongings if you have no place to store them?”; What good are your belongings if you cannot keep them anywhere?
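In the spirit of “a place to store your belongings”, the following is a minimal singly linked list sketch. The `Node` and `LinkedList` names are illustrative and not taken from the article; each node stores one value plus a reference to the next node.

```python
class Node:
    """One storage slot: a value plus a pointer to the next slot."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, value):
        # O(1): the new node simply points at the old head
        self.head = Node(value, self.head)

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.value
            node = node.next


items = LinkedList()
for value in ("c", "b", "a"):
    items.push_front(value)
print(list(items))  # ['a', 'b', 'c']
```

Inserting at the front never shifts existing elements, which is the main practical advantage over an array-backed list.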
439. Combating Human Trafficking With Collaborative Data Collection
Christi Wigle, co-founder, CEO of the nonprofit United Against Slavery, helped to identify a need for collaborative data collection in the anti-traffic space.
440. Data Will Never Be Clean But You Can Make it Useful
Understanding how to clean data is essential to ensure your data tells an accurate story
441. How IoT Platforms can Control Digital Transformation 2.1
IoT platform technologies enable traditional businesses to transition to become digital-based businesses.
442. How DAGs Grow: When People Trust A Data Source, They'll Ask More Of It
This blog post is a refresh of a talk that James and I gave at Strata back in 2017. Why recap a 3-year-old conference talk? Well, the core ideas have aged well, we’ve never actually put them into writing before, and we’ve learned some new things in the meantime. Enjoy!
443. Why I Decided to Bring a New Cloud Data Warehouse to Market
So we’ve all heard that “data is the new oil” way too many times. It’s been said so often that I personally feel slightly nauseous every time someone says that (sorry).
444. 9 Dropbox Alternatives for Evidence Management in Law Enforcement
For evidence management and storage, there are multiple alternatives to Dropbox available which can save time and be customized for your specific needs.
445. Where Visuals And Algorithms Collide: How Unrelated Algorithms Produce Intuitive Markings
A nautilus seashell with a perfect spiral is the product of specific DNA that coded for its existence.
446. Is the Pinterest Way to Measure Ads the Right Way?
A cutting-edge data science model can only be created if impact is measured properly. Pinterest upgraded everyone's preferred impact measurement metric, CTR.
447. An Introduction to Data Connectors: Your First Step to Data Analytics
This post explains what a data connector is and provides a framework for building connectors that replicate data from different sources into your data warehouse
448. Don't Be Data-Driven. Become Purpose-Driven and Data-Assisted.
449. How The Heck Did Robinhood Become So Popular? A Data Driven Analysis
Robinhood launched over seven years ago as a stock prediction app, before it became the brokerage we have today.
450. Why Data Anomalies are More Important Than You Think
It is easy to be annoyed by strange anomalies when they are sighted within otherwise clean (or perhaps not-quite-so-clean) datasets. This annoyance is immediately followed by eagerness to filter them out and move on. Even though having clean, well-curated datasets is an important step in the process of creating robust models, one should resist the urge to purge all anomalies immediately — in doing so, there is a real risk of throwing away valuable insights that could lead to significant improvements in your models, products, or even business processes.
451. Why visualizations in Health don’t work
Visualizations in the most favorite health apps don’t have enough comparing and exploring possibilities.
452. Data Breaches: Why You Should Never Share Your Passwords
Data Breaches: Why You Should Never Share Your Passwords
453. Sustainable Computing beyond the Cloud
Extreme increases in data streams are expanding the cloud's carbon footprint; a sustainable alternative to Cloud dependence has been developed.
454. 3 Things I Learned Building My First Neural Network
I’ve been working with massive data sets for several years at companies like Facebook to analyze and address operational challenges, from inventory to customer lifetime value. But I hadn’t worked yet on something this ambitious.
455. The Importance of HIPAA Compliance to Protect Sensitive Data
HIPAA Compliance mainly aims to prevent any kind of misuse or illegal disclosure of protected health information (PHI).
456. 3 Important Integrations For Your Time Tracking Software
Effective time management is critical to success.
457. Future of Marketing: How Data Science Predicts Consumer Behavior
Gradually, as the post-pandemic phase arrived, one thing that helped marketers predict their consumer behavior was Data Science.
458. The Power of Using Data to Overcome Post-Pandemic Woes
Small to medium-sized businesses (SMBs) need more support from financial institutions.
459. Living in the world of AI - The Human Transformation
Today, if you stop and ask anyone working in a technology company, “What is the one thing that would help them change the world or make them grow faster than anyone else in their field?” The answer would be Data. Yes, data is everything. Because data can essentially change, cure, fix, and support just about any problem. Data is the truth behind everything from finding a cure for cancer to studying the shifting weather patterns.
460. Public Web Data for Business: Common Challenges And How to Solve Them
Businesses working with public web data experience various challenges. This article covers the most common ones and how to overcome them.
461. The 20 Slides That Raised $7 Million
Fundraising is a funny art.
462. What is Open Finance and How Will it Shape the Future of the Financial Industry?
The article covers the hottest topic in the financial Industry today: Open Finance.
463. IoT: Is Data Beyond Ethics?
Over the last decade, the Internet of Things has been delivering heaps of data and remote device control across virtually every industry, from healthcare to hospitality.
464. An Introduction to Data Automation for Business Efficiency
In today’s competitive business landscape, data automation has become necessary for business sustainability. Despite the necessity, it also comes with a few challenges (collecting, cleaning, and putting data together) on the way to meaningful insights.
465. 5 Most Important Tips Every Data Analyst Should Know
The 5 things every data analyst should know and why it is not Python, nor SQL
466. 14 Best Tableau Datasets for Practicing Data Visualization
This article focuses on the 14 Best Tableau Datasets for Practicing Data Visualization, which is essential for business analysts and data scientists.
467. Why Data Privacy is Important for Users in the Web3 Ecosystem
Interview discussing why data privacy is important for users in the web3 ecosystem
468. How to Democratize Access to Data Insights for Businesses of All Sizes
Messy government data has been part of the reason we've been unable to understand the COVID-19 pandemic. If federal organizations can't decode big data, what hope do small businesses have?
469. Downloading Data as a File with Alpine.js
A quick demonstration of using JavaScript to download ad hoc data.
470. How The Metaverse Relies on The Data Economy
The Metaverse isn't just built on the Data Economy, the Data Economy is the Metaverse.
471. 6 Tips for Tracking Software Licenses
If you are looking to manage your software licenses better, read on to learn how you can track and document the software used in your company with ease.
472. The Newest Frontiers on the Modern Data Stack
Data-driven operations are the newest frontier of the Modern Data Stack with tools.
473. Unliked: Facebook’s Reign Could End
Facebook shares went tumbling after a ‘sell’ recommendation from Michael Levine of Pivotal Research Group. Levine cited concerns over Facebook’s ad revenue as well as ongoing regulatory risks.
474. Tips On Web Scraping to Find Amazon’s Bestselling Products
There’s no doubt that in order to make a decent profit on Amazon, it is essential to choose the best product to sell. To find out which product sells the best, we need to conduct product research to understand the market.
475. My Weird Career Transition From MBA to Data Science
Yes, you read it correctly! I am calling my transition from being an MBA to being the Analytics Manager at a well-known consumer retail brand a "WEIRD" one. And why do I say that? Because during my 5-year journey in data science, I have had the opportunity to work with a lot of business stakeholders, such as marketing heads, brand managers, and sales heads, and many times they have asked me about my educational background. I would like to think that they asked this because of my ability to present solutions while keeping the business context and execution feasibility in mind. The reason for asking might be different for every individual, but when I tell them that I am an MBA, their reply has always been the same: "What made you choose a technical career path after pursuing an MBA?" And hence I decided to write this post to share my thoughts on 2 things:
476. How to Implement Digital Twin Architecture
What technologies are behind the digital twin, and how should you reasonably approach its creation? Discover a detailed explanation in this article.
477. Life360 Potentially Leaves Its Users’ Sensitive Data at Risk
The family safety app Life360 doesn’t have some standard guardrails to prevent a hacker from taking over an account and accessing sensitive information.
478. Introduction to Redis: The In-memory Database
Redis is a type of database and it can be added to your production level application to make it more performant. I will cover the basics of Redis and show a real world example of Redis.
479. Can We Make Data Tidy?
The only way data specialists can facilitate analysis is by keeping data clean and organized.
480. "SQL Will Help You Achieve Your Goals" - Jarosław Błąd, Vertabelo CEO
We spoke to Jarosław Błąd, CEO at Vertabelo about how he set up his website based around teaching people how to code using SQL and the motivations behind it.
481. Why Businesses Need Data Governance
Governance is the Gordian Knot to all Your Business Problems.
482. 12 Best Pre-Installed R Datasets Commonly Used for Statistical Analysis
R programming is mostly used in statistical analysis and ML. This article looks at the Best Pre-Installed R Datasets Commonly Used for Statistical Analysis.
483. How Advanced Analytics Can Improve the Public Sector
Advanced analytic models can identify and predict negative outcomes such as health and safety challenges or compliance risks that would be overlooked by manual.
484. A Step-by-Step Guide to Failing a Data Science Project
As posited by Lev Tolstoy in his seminal work, Anna Karenina: “Happy families are all alike; every unhappy family is unhappy in its own way.” Likewise, all successful data science projects go through a very similar building process, while there are tons of different ways to fail a data science project. However, I’ve decided to prepare a detailed guide aimed at data scientists who want to make sure that their project will be a 100% disaster.
485. Five Recent FAAGM Statistics That Underline the Tech Industry's Longer Term Trends
30% fee Apple charges developers for App Store transactions
486. Spyse Introduction: Cybersecurity Search Engine for Data Gathering
Data gathering has always been a long process which required multiple services running simultaneously and spending hours scanning alone. With new services like the Spyse search engine, these processes have been simplified drastically.
487. A JavaScript Infographic: Data Science Salaries in 2022
Data visualisation infographic with insights on salary level of data scientists - how to create the JavaScript dashboard and analyse its data
488. Setting up Kafka on Docker for Local Development
In a world where data is king, Kafka is a valuable tool for developers and data engineers to learn.
489. How to Use Business Intelligence: 66% of Companies Want to Be More Data-Driven in 2021
How do BI solutions help to make the decision-making process driven by data, improve CX, and speed up reporting? And how can you implement it yourself?
490. The Truth About Less Biased Data-Informed Predictive Policing
Critics say it merely techwashes injustice
491. Insiders Breach Your Organization’s Data (Data Tells Us So)
Many company executives claim that the biggest threats to their data privacy are external threats, such as hackers or state-funded cyber-threats. However, companies are actually more likely to experience a data breach from an internal source, whether it is malicious or accidental.
492. Democratizing the New Data Economy in 2023: With Co-Founder of Seagate Technology, Finis Conner
Big tech firms get their wealth & power from the data they freely collect from users. But new blockchain tech is about to destabilize the entire data landscape.
493. A Guide on The Future of ETL: EL(T) not ELT
How we store and manage data has completely changed over the last decade. We moved from an ETL world to an ELT world, with companies like Fivetran pushing the trend. However, we don’t think it is going to stop there; ELT is a transition in our mind towards EL(T) (with EL decoupled from T). And to understand this, we need to discern the underlying reasons for this trend, as they might show what’s in store for the future.
494. Best Types of Data Visualization
Learning about best data visualisation tools may be the first step in utilising data analytics to your advantage and the benefit of your company
495. Indebted to the Tech
Having been in software long enough to work across industries and with a variety of distributed systems, I know every system has its day. Software and systems engineers create performance requirements, latency constraints, and sweat the details on software quality. Their hard work ensures retail sites stay online through black Friday, ad campaigns survive the Superbowl, accounting systems make it through tax season, and media sites stay up to date during the election. And that’s not even counting safety critical systems we rely on each day, systems that keep our hospitals, our vehicles, and our industries humming.
496. How to Install and Use Materialize to Run SQL Queries on your nginx Logs
In this tutorial, I will show you how Materialize works by using it to run SQL queries on continuously produced nginx logs. By the end of the tutorial, you will
497. Pickling and Unpickling in Python
In this blog, you will learn about the pickling and unpickling process. Although it is quite simple, it is very important and useful.
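A quick round trip with the standard library's `pickle` module illustrates the idea; the `record` dictionary is made-up sample data.

```python
import pickle

# Made-up sample object to serialize.
record = {"id": 7, "tags": ["data", "python"], "score": 0.5}

payload = pickle.dumps(record)    # pickling: object -> bytes
restored = pickle.loads(payload)  # unpickling: bytes -> new object

print(type(payload).__name__)  # bytes
print(restored == record)      # True
print(restored is record)      # False
```

Unpickling can execute arbitrary code, so only load pickles that come from a source you trust.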
498. Data Mesh - A Contrarian View
You've heard of "Data Mesh" and want to know if it really is all that and a side of fries?
499. Applying Criminology Theories to Data Management: "The Broken Window Theory" and "The Perfect Storm"
What can be done to prevent “Broken Windows” in the primary data source? How can we effectively fix existing “Broken Windows”?
500. Cross-selling in eCommerce Matters: A Technical Guide for Upselling Online
Cross-selling and upselling are key areas to focus on with an eCommerce business, and this article will teach you how to implement upselling on your storefront.
501. 10 FinTech Trends in 2021 [Part II]
You can read the first part of this article here. For those who for some reason don’t like to follow the links, let me remind you briefly: in the first part, we made a retrospective of fintech trends in 2020 and delved into the first 5 trends in 2021.
502. You are NOT a Person, YOU are a Data Point: A Christmas Story
You see, on the Internet, you’re not a person. You’re a data point. A small player in a grand A/B testing experiment of an intricate user journey.
503. Life360 to Stop Selling Precise Location Data
Life360, a safety app, claims it will stop selling precise location data.
504. Best Courses and Certification Exams to Help You Become a Scrum Master
Scrum Alliance, scrum.org, and ICAgile are some of the best Scrum certification providers in the market. Certifications from a reputable source provide insight.
505. Collecting Data from 1.1M Hacker News Curated Comments
In this test we use the data collection of 1.1M Hacker News curated comments with numeric fields from https://zenodo.org/record/45901.
506. An Introduction to Automation in Vision AI
Levels of Annotation Automation
507. Think You Know Why Google Acquired Fitbit? Think Again!
There's more than meets the eye when it comes to Google's acquisition of Fitbit. Read on to learn more.
508. 4 Reasons Why CFOs Must Adopt Artificial Intelligence
In today’s fast-paced world, CFOs must be open to empowering their finance team with AI-based technologies for increasing efficiency and cost-effectiveness.
509. Is Your Data Biased? How To Overcome Survivorship Bias
In this post, we study survivorship bias: the danger of concentrating your data analysis solely on existing power users.
510. "It's Easy to Start but Very Hard to Finish Strong" - Marketing Data Founders
Amy Tom chats to Gil Allouche (CEO of metadata.io) and Tom Coburn (CEO of Jebbit) about starting a marketing data company. They chat about how they each started
511. A Brief Introduction Into A Typical Data Science Project Life Cycle
In this post, I demystified data science and talked about the lifecycle of a typical data science project. It's a good read for everyone.
512. Your Data is the DNA of Artificial Intelligence
As society becomes increasingly AI-driven, the essential raw material to create artificial intelligence is your data.
513. Perfect the Quality of Your Imperfect Data
Poor quality data can significantly drop the ROI of a company’s CRM and marketing automation investment.
514. Creating A Data Science Pipeline That Works Correctly
An easy, automated, repeatable way to check your data science solution is doing exactly what it's designed to do.
515. Everyone Works for Facebook and Google
Imagine a car factory where nameless workers in a sprawling complex make expensive cars all day and all night. Thousands of shiny, new expensive cars move off the line every millisecond and are shipped instantly all over the world to wealthy buyers, generating $195 million of profit daily for the car company.
516. Adopt the Automation Route to Scale Up Your Business
Machine Learning is advancing steadily, enabling computers to understand natural language patterns and think somewhat like humans. The advances in Artificial Intelligence (AI) are increasing the prospects of businesses to automate tasks. With automation, you can save time and bring in more productivity for your business.
517. The Ethics Behind Data Collection and Privacy [Infographic]
A look at the future of data collection and online privacy through cookieless tracking.
518. Analyzing Data From U.S. Road Accidents With Data Visualization
In this article, we would be analyzing data related to US road accidents, which can be utilized to study accident-prone locations and influential factors.
519. Introduction 5 Different Types of Text Annotation in NLP
Natural language processing (NLP) is one of the biggest fields of AI development. Numerous NLP solutions like chatbots, automatic speech recognition, and sentiment analysis programs can improve efficiency and productivity in various businesses around the world.
520. The Difference Between Privacy Talkers and Privacy Doers
I introduce the concept of storing and processing data focussing primarily on user privacy in my book, “Data is Like a Plate of Hummus”. I know many of you have read it – perhaps you're even thinking about it now ahead of the upcoming changes to Apple’s privacy settings which will block the attribution of users without consent.
521. Smart Cities Raise Data Privacy Concerns
Should you be excited about smart cities or concerned about your privacy and data? I go through three of the ten privacy principles (PIPEDA) and their effects.
522. Build a Live Dashboard with Materialize, Airbyte, MySQL and Redpanda/Kafka
523. Crowd learning and Crowdsourcing: Trends in Contemporary Journalism
In the last few years, technological advancements have reshaped the way every individual, society, and industry functions. Journalism is no different.
524. Azure Data Factory: An Amazing Data Migration Tool
This blog will highlight how users can define pipelines to migrate the unstructured data from different data stores to structured data via Azure Data Factory
525. 10 Ways to Reduce Data Loss and Potential Downtime Of Your Database
In this article, you can find ten actionable methods to protect your mission-critical database.
526. Data-Driven Approach for Software Engineering: How to Avoid Common Problems
In today’s digital world, data is constantly being generated, evaluated, and updated. It also plays an important role in the work of software engineers by providing accurate, actionable feedback that helps engineers understand where and how to make improvements to a product or process.
527. HN Editor Picks: Top Tech Stories of March 2023
Take a look at all the best HackerNoon stories, handpicked for your reading pleasure and education on trending tech topics.
528. Automation for Girl Scout Events
The shutdowns brought an opportunity for my daughter to participate in virtual scouting events all over the United States. When the event registration form changed, I took the chance to try out some new web scraping skills while inspiring my daughter about the power of code for everyday tasks.
529. We Built a Modern Data Stack for Startups
Here's how we built our data stack at incident.io. If you're a company that cares about data access for all, follow this guide and we guarantee great results.
530. Digital Transformation Strategy: Dinosaurs, Harpoons, Greek Myths and YOU!
Digital transformation is not one single thing to implement, it is a core alignment with continuous investment in innovation and excellence.
531. Open-Source Data Integration and ETL in 2020
Open-source data integration is not new. It started 16 years ago with Talend. But since then, the whole industry has changed. The likes of Snowflake, BigQuery, and Redshift have changed how data is being hosted, managed, and accessed, while making it easier and a lot cheaper. But the data integration industry has evolved as well.
532. Nevermined: How organizations can manage or monetize their data with a next level solution
This post provides a short technical overview of Nevermined’s capabilities
533. From Farm To Data - A Tech Career in Product Marketing (Podcast Transcript)
Amy Tom talks to Matt Groves, Senior Product Marketing Manager at Couchbase, and Rob Hedgpeth, Developer Advocate at MariaDB about their careers.
534. How Technological Progress Takes Affiliate Marketing to the Next Level
The world around us is changing at a rate that’s often incomprehensible to us. We’re seeing new inventions everywhere we look, and it’s happening every single day. 5G, AI, VR, satellite Wi-Fi - these and other inventions are constantly hitting our cognitive system, making it overwhelmed.
535. My Bot Helps You Trade Data - Introducing ARBot
Data is becoming increasingly recognised as an asset of value. So much so, in fact, that data marketplaces have opened up, establishing an emerging data economy. This has opened up a wealth of profit-making opportunities that most people are still unaware of. Having worked closely with leading data marketplaces for over a year, I decided to try my hand at something new: arbitrage with data as an asset.
536. 5 Reasons Why the Blockchain is NOT A Good Fit for Your Business
Alexandr Kurbatov, EnCata Soft CBDO, explains why businesses should hold off on adopting this popular technology, and the cases where it is still needed.
537. As Internet Usage Spikes During COVID-19 Pandemic: How Are ISPs Holding Up?
As communities worldwide grapple with the reality of an extended COVID-19 induced lock-down, Internet usage has, understandably, significantly increased. As more and more countries are forced into lock-down in an effort to curb the spread of the deadly virus, the growing number of people forced into working remotely and finding online entertainment is seeing the Internet absolutely explode. In fact COVID-19 has pushed up Internet use by a whopping 70 percent in some countries while streaming services are up by more than 12 percent, figures last month revealed.
538. DeFi Meets NFT With $MEGA Yield Farming in The MCP3D Decentralized City
Recent months demonstrated explosive growth of Decentralized Finance, with $13B+ in total value locked. Normally, games are the first thing to take off on new platforms, and it seems DeFi is no exception here.
539. Data Lakes Are Crucial To Business Analytics and Big Data Processing
Businesses of all sizes are familiar with the term data, and even laypeople are aware of the buzz and fuss around it. From the database to the data warehouse and now the data lake, we have come a long way.
540. Digital Consumers: It's Time to Grow Up
The recent release of Netflix's film “The Social Dilemma” has boosted existing questions and fears looming among consumers regarding their privacy on social media platforms. Putting aside the behavioral effects of social media, one must wonder why the modern consumer has become so critical and scared of the data-gathering and targeted ads behind social media when they have made our lives so much better.
541. Improving Our MongoDB Write Throughput with SQS
A deep dive on how we kept our MongoDB CPU load steadily below 50% by using an SQS layer between our Node application and the database to save costs.
542. Numbers Don't Lie But They Are Easily Misinterpreted All The Time
Therefore, a working compromise between these two extremes should be found on a case-by-case basis.
543. U.S. Citizens' Data Protection: Why DHS Work Is Important
In this article, we will learn how PII should be identified, collected, analyzed, and used, and how PII/SPII is handled on different projects of the Department of Homeland Security (DHS).
544. How to Generate Synthetic Data?
A repository dedicated to synthetic data generation. It is a sentence that is getting too common, but it’s still true and reflects the market's trend: data is the new oil. Some of the biggest players in the market already have the strongest hold on that currency.
545. How to Create A Funnel Chart In R
Funnel chart in R: a funnel chart is mainly used to show the flow of users through a business or sales process.
546. Big Data Analysis on Blockchain with CEO of Covalent, Ganesh Swami
I sat down with Ganesh Swami, co-founder and CEO at Covalent, a Blockchain Big Data analytics firm, to discuss the Ethereum ecosystem.
547. Linear Regression vs. Logistic Regression for Classification Tasks
This article explains why logistic regression performs better than linear regression for classification problems, and 2 reasons why linear regression is not suitable:
548. How We Collaborated with Meta (Facebook) to Create Shadow Cache
Shadow cache is deployed in Meta (Facebook) Presto and is being leveraged to understand the system bottleneck and help with routing design decisions.
549. Top 3 Advantages of Video Annotation
The 3 Major Advantages of Annotating Video with the Innotescus Video Annotation Canvas.
550. What are the Best Options to Store Data and Keep it Safe Forever?
This article will go over the most effective, economical and long-lasting methods for storing our data.
551. A Tale of Two Pitch Decks
You arrive at a fork in the road. Rain pours down, with no end in sight. You are running out of resources, namely money, from which most resources are derived, including food and basic shelter. You are unable to make money out of the surrounding forest. Everything is very wet, you don’t have a saw, or a printer, or green ink. You have just listed three things you think go into the formation of paper currency, but you are generally unsure. You have now wasted an additional sixty seconds, which you will never get back. Your inventory consists of: 1. A business idea that you think is pretty good. 2. A mobile app for your business that mostly works. 3. A pitch deck that you believe represents your true mission (but also says “fuck” on the first slide). 4. A pitch deck that is for a similar but different business. (A positioning more enticing to investors.) 5. Enough money to last you approximately two to three months. 6. A 1952 Topps Mickey Mantle rookie card. What do you do? > Try to sell Mickey Mantle card. You take a risk and leave the fork in the road in search of the nearest collectible merchandise shop. During the treacherous journey, you fail to answer a troll’s riddle and are forced to give up more money. You finally arrive at a collectibles shop. Your hope meter increases by 20%. The Mickey Mantle card is valued at approximately $5 million USD. You reach into your jeans pocket. You present the store clerk with a soggy ball of cardboard which drips yellow and blue ink all over the counter. I told you the pouring rain showed no signs of stopping. The clerk feels bad for you and is willing to give you $5 USD. You accept. Now what? > Extract ink from counter.
552. You Are the Cure to Imposter Syndrome in Data Science
Imposter syndrome is a common experience for data scientists. But there are ways to tackle it and succeed despite it.
553. How to Make Your Own and Free Backup Application
In our age of rapidly developing technologies, data loss can be a disaster not only for large corporations, but also for the average user, showcasing the immense importance of backup and data recovery in today’s data driven world.
554. Automated Data Catalogs will Help Manage Data in 2022
Data is increasingly playing a dominant role in business. Learn how automating your data catalog can help with efficient data management in 2022.
555. Storing data with Vinyl
This article describes how the developers of the in-memory computing platform Tarantool implemented disk storage.
556. Machine Learning Trends Businesses Should Know In 2020
Have you ever considered how much data exists in our world? Data growth has been immense since the creation of the Internet and has only accelerated in the last two decades. Today the Internet hosts an estimated 2 billion websites for 4.2 billion active users.
557. WhatsApp Users Hit 2 Billion: What Does This Mean for the Future of Privacy?
There are now over 2 billion registered users on the mobile messaging platform, up from 1.5 billion in 2017.
558. In 2019, Securing Data Is No Easy Task. Clickjacking- A Case Study
This article is about my journey to understand the current practice of de-anonymization via the clickjacking technique whereby a malicious website is able to uncover the identity of a visitor, including his full name and possibly other personal information. I don’t present any new information here that isn’t already publicly available, but I do look at how easy it is to compromise a visitor’s privacy and reveal his identity, even when he adheres to security best practices and uses an up-to-date browser and operating system.
559. It's Time to Normalize Speaking Out About Internet Safety
GIVE Nation marks Safer Internet Day with a conversation around privacy, particularly for parents.
560. Save and Search Through Your Slack Channel History on a Free Slack Plan
Sometimes, we might not be able to afford a paid subscription on Slack. Here's a tutorial on how you can save and search through your Slack history for free.
561. Machine Learning in Cybersecurity: 5 Real-Life Examples
From real-time cybercrime mapping to penetration testing, machine learning has become a crucial part of cybersecurity. Here's how.
562. 10 Best African Language Datasets for Data Science Projects
A list of African language datasets from across the web that can be used in numerous NLP tasks.
563. Are freelance developers different?
Rise of the contract coder
564. How Facebook Makes Money and Why You Should Worry
Facebook sells ads, as Mark Zuckerberg famously and patiently explained to Congress, but it’s a little more complicated than that.
565. Data Fingerprinting in JavaScript
I want to talk a little about how you can use content-based addressing (aka data fingerprinting) as a general approach to make your applications faster and more secure with some practical JavaScript examples.
566. A Guide to Authoring Power BI Reports on Real-Time Google Sheets Data
CData Power BI Connectors provide self-service integration with Microsoft Power BI. The CData Power BI Connector for Google Sheets links your Power BI reports to real-time Google Sheets data. You can monitor Google Sheets data through dashboards and ensure that your analysis reflects Google Sheets data in real-time by scheduling refreshes or refreshing on demand. This article details how to use the Power BI Connector to create real-time visualizations of Google Sheets data in Microsoft Power BI Desktop.
567. Raising Funds as a Blockchain Startup: A KYVE Interview
This article talks about how to raise funding as a blockchain startup, and about decentralized storage systems.
568. The Beginner's Guide to The Google HEART Framework
In this post, we will dig into the Google HEART framework: a simple way to ensure you take into consideration every aspect of the user journey.
569. Hadoop for Hoops: Explore the Whole Ecosystem and to Know How It Really Works
Technological evolution has changed the landscape: much of what we see and hear today revolves around modern technology. This includes artificial intelligence, big data, cloud computing, data science, and much more. To integrate these technologies, many IT professionals are working out how to adopt and implement them.
570. Understanding the tech behind Snowflake’s IPO and what’s to come
By now you must have read quite a few articles about Snowflake’s absolutely mind-blowing and record-setting IPO. This article is not intended to speculate on whether the valuation makes sense or not, but rather to help you understand the technological concepts that make Snowflake so unique, and why it has proven to be so disruptive for the data space in general and the data warehousing space in particular.
571. What Is Big Data? Understanding The Business Use of Big Data Analytics
Big data analytics can be applied to any business to boost revenue and conversions and to identify common mistakes.
572. Your Website Knows Where Your Users Are—But Is It Keeping That Data Secret?
It’s one thing to share user geolocation data deliberately without consent, but what if you’re inadvertently giving it away?
573. 5 Features To Consider When Looking For a Reliable Data Loss Prevention (DLP) Software To Buy
A data loss prevention (DLP) solution plays a crucial role in preventing unauthorized access to an organization’s sensitive data.
574. How to Build a Data Stack from Scratch
An overview of the modern data stack after interviewing 200+ data leaders. Decision matrix for benchmarking (DW, ETL, governance, visualisation, documentation, etc.).
575. On Yang's Data Dividend Project and the Push to Treat Data as Property
576. What Are Conflict-free Replicated Data Types (CRDTs)?
In a world where most of the apps that we use on the internet are collaborative in nature, conflicts in data are common. Is there a way to avoid them?
577. Does Tech help you make Smarter Decisions? Not in the way you think
578. We Need To Build A Human-Centered Future Because Regulations Alone Aren't Enough
Emerging data privacy regulations, such as the General Data Protection Regulation (GDPR) in Europe and the California Consumer Privacy Act (CCPA) in the USA, have sparked global discussion around:
579. In a Time of Crisis, Data Must Be Able to Defend Itself
From hijacked routers to an attempted hack on the World Health Organization, our time of crisis shows that hackers are opportunists to the core. Health records, social security numbers, IP … everything is fair game, nothing sacred or immune. At least in the current online infrastructure. On a long enough timeline, the probability of a hack nears 100%.
580. Twitter Sentiment Analysis for the 2019 Lok Sabha Elections
581. Data Services for the Masses
I’ve held several roles in my career in IT, ranging from software developer to enterprise architect to developer advocate. I’ve always been fascinated by the role that data plays in our applications—putting it into databases, getting it back out quickly, making sure it remains accurate when transferred between systems. Many of the hardest problems I’ve encountered have centered around data. For example:
582. Data Playgrounds are The Cure for Slow and Inefficient DataOps
Companies struggle with their DataOps due to a flawed, code-centric, and linear workflow. To succeed, they must build data playgrounds, not mere pipelines.
583. A Guide to Scraping HTML Tables with Pandas and BeautifulSoup
How to not get stuck when collecting tabular data from the internet.
584. Data Persistent Prometheus-Grafana Intergration with Jenkins
Prometheus is an open-source application monitoring and alerting software solution. It is a web application which can be deployed anywhere: on a PC, in a virtual machine, or even in a container. It periodically scrapes data from the exporters (small programs that convert system data to Prometheus metrics) and records the real-time metrics in a time series database.
585. 9 Best Data Integration Software in 2022
Every business needs to collect, manage, integrate, and analyze data collected from various sources. Data integration software can help!
586. Holy Land of Crypto Users: How does Web3.0 Data Empower Centralized Exchanges?
Designing a data-oriented, user-incentive mechanism is a good path when developing the future of centralised exchanges for the cryptocurrency industry.
587. What You Should Know About Zero-Party Data
Zero-party data (ZPD) means a company only collects user data that is freely given. Period. But why would a modern business, raised on the wonders of Big Data, undertake such a foolish philosophy? Maybe because they aren’t fans of financial ruin.
588. Can Technology Save the Oil and Gas Industry?
Digital advances are here to help oil and gas companies transcend the limitations of traditional techniques.
589. Open Source is the Only Way to Address the Long Tail of Integrations
Wouldn’t it be great to bring the time needed to build a new data integration connector down to 10 minutes? This would definitely help address the long tail of
590. Top 7 Announcements from Data and AI Summit 2021
7 Highlights from the Data and AI Summit (formerly Apache Spark Summit) for the Busy IT Professional
591. So Safe You'll Cloud 9
Cloud migration means to move business data, applications, or other critical business services from on-premise data centers or onsite computers to the cl
592. The 20 Most Popular Business Intelligence (BI) Tools in 2020
Business Intelligence (BI) is a data-driven business, a decision-making process based on collected data. It is often used by managers and executives to generate actionable insights. As a result, BI is often referred to interchangeably as "Business Analytics" or "Data Analytics".
593. How to Create a Responsive Table with HTMX and Django
A guide to creating a responsive table in your web applications, using Django and htmx to process your website's data.
594. How To Do Data Mapping in Kumologica
Data mapping is a key element in integration. Most of the prominent integration tools provide different capabilities for data mapping. In this article, I share how data mapping can be achieved in Kumologica. Kumologica uses JSONata as the base for data mapping. JSONata is a lightweight query and transformation language for JSON data. It supports complex query expressions with minimal syntax, and its location path semantics are inspired by XPath 3.1.
595. Can a Data Scientist Drown a City in 3 Feet of Water?
Yes. Let’s dive into the details.
596. Predict Customer Churn With Machine Learning, Data Science and Survival Analysis
Predicting customer churn is very important because businesses have limited resources and cannot afford to lose customers if they want to stay profitable.
597. What is Quantum Key Distribution Method?
We have long sought secure ways to exchange data. Some current methods include cryptography, hashing and requiring the solution of math problems that demand enormous computing power. Quantum computing could render some of our current methods insecure and obsolete, while enabling new methods.
598. Scraping with Selenium 101: The Big Hole on Data Scientists Toolset [Part 1]
Usually forgotten in data science master's programs and courses, web scraping is, in my honest opinion, a basic tool in the data scientist's toolset, as it is the tool for getting, and therefore using, data from outside your organization when public databases are not available.
599. How To Meaningfully Interpret COVID-19 Data
600. Exposing Mediocre Teachers at My School using Data - Here's My Failed Attempt
A programmer’s story
601. Planning for Your Startup: The Data Team's Guide to 2021
Planning in a startup can feel like an exercise in futility — especially when it comes to data — especially when your data team is small and scrappy.
602. How Important is the API Economy for Blockchain Application Development?
A blockchain cannot take care of all the information it handles. It should focus on its core capability, the blockchain itself, rather than on providing different data options.
603. How LTO Network Tackles Existing Problems With Blockchain Technology
Today, businesses rely heavily on data. Not to mention, advances in information technologies present a challenge to businesses to constantly be on top of the growth curve. Legacy systems are quickly becoming more inefficient when compared side by side with newer innovations. Furthermore, the exponential speed of change we have today creates more intelligent security threats, which can cost a significant amount of money to rectify. Therefore, if your business is getting smarter, why shouldn’t your business processes smarten up as well? Enter LTO Network.
604. How To Add Data Sensitivity Classification Command in SQL Server 2019
For a database administrator, the common everyday practice involves running multiple operations targeted at ensuring database security and integrity. Thus, we shouldn’t overlook the importance of sensitive data stored in the database under any circumstances. In light of this, we are excited to demonstrate the new ADD SENSITIVITY CLASSIFICATION command introduced in SQL Server 2019, which allows adding the sensitivity classification metadata to database columns.
605. How To Segment Shopify Customer Base with Google Sheets and Google Data Studio
After defining what RFM analysis stands for and how you can apply it to your customer base, I want to show you how to apply it to Shopify order data.
606. Data-Driven Advertising and Its Impact On Our Privacy-Driven World
Do we actually need so much data to do effective marketing?
607. Data Access for Microservices
If you want to access data in a distributed environment such as in a microservice architecture, then data services are the way to go. The idea is to create a data abstraction layer (DAL) that the rest of the system’s applications and services can share. Thus, a data service gives you a generalized interface to the data you’re exposing and provides access to it in a standard manner. This would be in a well-understood protocol and a known data format. For example, a popular approach is to use JSON via HTTP/S.
608. Business Intelligence in microservices: improving performance
Do you know why microservice design is so popular within the development of BI tools? The answer is clear: it helps to develop scalable and flexible solutions. But microservice architecture has a great drawback. Its performance usually requires great improvements.
609. Build vs Buy: What We Learned by Implementing a Data Catalog
Why we chose to finally buy a unified data workspace (Atlan), after spending 1.5 years building our own internal solution with Amundsen and Atlas
610. Data Privacy is Becoming More Important for Users in 2022
A look at how data privacy is becoming more important for users in 2022
611. Data Lineage is Like Untangling a Ball of Yarn
Data lineage is a technology that retraces the relationships between data assets. 'Data lineage is like a family tree but for data'
612. How Big Data Can Bring Transformative Improvements to Medical Care
In the healthcare landscape, providers and lawmakers alike are faced with the challenge of making the best possible decisions for patients and the industry as a whole. From choosing the best treatments to using resources in a responsible manner, medical leaders are making decisions on a daily basis that can significantly impact health outcomes and costs.
613. Data Analytics is a Journey
It is 2020, and data analytics has gained so much attention, even outside of the tech community. "Data is gold", they say, and no one wants to be left behind. However, getting the right strategy is neither a straightforward nor a static process.
614. The People and Tech Behind Data Science
What is a data scientist? The job has been around for hundreds of years, though as you may suspect things have changed significantly, especially over the last century. In the 1740s Bayes’ Theorem posited that when new data was added to an existing belief, the result was a new and improved belief. This is the basis for the scientific method, by which scientists discover better and better explanations for things. When applied to data, the scientific method creates data science, in which data scientists can use the piles of data people are generating to discover new and better predictions about the future.
615. What is Web Data Collection?
Everything you need to know to automate, optimize and streamline the data collection process in your organization!
616. 5 Simple Ways to Kickstart Your Freelance Data Science Career
If you’ve been itching to get your feet wet in the field, these steps will provide you with lots of valuable ideas and suggestions to kickstart your career.
617. Currency-As-A-Model for Reframing The Debate on Data Privacy: A Thought Experiment
Using Currency as a Model for Reframing The Debate on Data Privacy. A thought experiment.
618. 3 Best Data Recovery Tools for Windows and Mac
Going through a hard drive crash and having to start your data recovery efforts all over again from scratch can be frustrating and time-consuming.
619. This Data Set Shows How Our Emotions Affect the Weather
An interesting observation on the emotional effect of weather. Correlation does not equal causation, but the situation remains fascinating to readers anyway.
620. CS Data Structures: Fixed Array
A fixed array is an array that has a maximum number of items. Such arrays are used when the programmer knows how many elements an array should hold.
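To make the idea concrete, here is a minimal Python sketch of a fixed-capacity array. The class name `FixedArray` and its methods are hypothetical names chosen for illustration; they are not taken from the linked story.

```python
class FixedArray:
    """A list-backed array that refuses to grow past a fixed capacity (illustrative sketch)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 0:
            raise ValueError("capacity must be non-negative")
        self._capacity = capacity
        self._items: list = []

    def append(self, item) -> None:
        # Reject the write instead of silently growing, unlike a plain list.
        if len(self._items) >= self._capacity:
            raise OverflowError("fixed array is full")
        self._items.append(item)

    def __getitem__(self, index: int):
        return self._items[index]

    def __len__(self) -> int:
        return len(self._items)


# Usage: the programmer knows up front that exactly three scores are expected.
scores = FixedArray(capacity=3)
for value in (10, 20, 30):
    scores.append(value)
```

A fourth `append` would raise `OverflowError`, which is the design choice that separates this sketch from an ordinary growable list.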
621. We Kinda Bypassed Firebase's Paywall: Here's How
Some time ago, a few friends and I decided to build an app. We duct-taped our code together, launched our first version, then attracted a few users with a small marketing budget.
622. How to Implement Heap in Data Structure
A heap is a balanced binary tree data structure in which each child node is compared with its parent node and then arranged accordingly.
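As a rough illustration of that ordering rule, the Python sketch below implements a small min-heap stored in a list, where the children of index `i` live at `2*i + 1` and `2*i + 2`. The class name `MinHeap` is hypothetical, and this is one common way to build a heap rather than the linked story's own code.

```python
class MinHeap:
    """Array-backed min-heap: every parent is <= its children (illustrative sketch)."""

    def __init__(self) -> None:
        self._data: list = []

    def push(self, value) -> None:
        # Add at the end, then sift up while the new value is smaller than its parent.
        self._data.append(value)
        i = len(self._data) - 1
        while i > 0:
            parent = (i - 1) // 2
            if self._data[parent] <= self._data[i]:
                break
            self._data[parent], self._data[i] = self._data[i], self._data[parent]
            i = parent

    def pop(self):
        # Remove the root (smallest value), move the last item to the root, then sift down.
        if not self._data:
            raise IndexError("pop from empty heap")
        last = self._data.pop()
        if not self._data:
            return last
        smallest = self._data[0]
        self._data[0] = last
        i, n = 0, len(self._data)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            child = i
            if left < n and self._data[left] < self._data[child]:
                child = left
            if right < n and self._data[right] < self._data[child]:
                child = right
            if child == i:
                break
            self._data[child], self._data[i] = self._data[i], self._data[child]
            i = child
        return smallest


heap = MinHeap()
for value in (5, 3, 8, 1):
    heap.push(value)
drained = [heap.pop() for _ in range(4)]  # values come out in ascending order
```

In everyday Python code the standard library's `heapq` module provides the same operations on a plain list.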
623. Statistics Cheat Sheet: A Beginner's Guide to Probability and Random Events
A beginner’s guide to Probability and Random Events. Understand the key statistics concepts and areas to focus on to ace your next data science interview.
624. Understand Data Analytics Framework Using An Example From General Electric Company
The framework will allow you to focus on the business outcomes first and the actions and decisions that enable the outcomes.
625. Querying Data With GraphQL & Ballerina
Take a look at the basics of GraphQL and how it is supported out-of-the-box with the Ballerina programming language.
626. How to Scrape NLP Datasets From Youtube
Too lazy to scrape NLP data yourself? In this post, I’ll show you a quick way to scrape NLP datasets using YouTube and Python.
627. #CorrelateThis - The Neural Activity of Mice Indicates Cryptocurrency Price Fluctuations
In his research, Gvido reported the discovery of neurons that showed a neural correlation to the price fluctuations of the main cryptocurrencies!
628. How to Use Different Data Visualizations in the Grafana Dashboard
In this post, we will see how to use different visualizations, like the simple graph, pie chart, and world map panels, in the Grafana dashboard by writing queries in the Influx query language.
629. Delta Compression: Diff Algorithms And Delta File Formats [Practical Guide]
A diff algorithm outputs the set of differences between two inputs. These algorithms are the basis of a number of commonly used developer tools. Yet understanding the inner workings of diff algorithms is rarely necessary to use said tools.
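For a feel of what "the set of differences" looks like in practice, here is a small Python sketch that uses the standard library's `difflib.SequenceMatcher`. This is a stand-in chosen for illustration, not the algorithm or delta format the linked guide covers, and the sample lists `old` and `new` are made up.

```python
import difflib

# Two hypothetical versions of the same input, as lists of lines.
old = ["alpha", "beta", "gamma"]
new = ["alpha", "delta", "gamma", "epsilon"]

# get_opcodes() describes how to turn `old` into `new`;
# dropping the "equal" runs leaves only the differences (the delta).
matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
delta = [
    (tag, old[i1:i2], new[j1:j2])
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag != "equal"
]

for tag, removed, added in delta:
    print(tag, removed, added)
```

Storing only `delta` plus the old version is the basic idea behind delta compression: the new version can be rebuilt without keeping a full second copy.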
630. Make Software Great Again: Can Open Source be Ethical and Fair?
Is there a way to go beyond open source, and have ethical, fair software in a cloud-first world? This is what some people in the open source community think.
631. Employee Training: How to Make Data-Driven Business Decisions
According to PwC research, highly data-driven organizations are three times more likely to witness considerable improvement in decision-making. Unfortunately, a whopping 62% of executives still rely more on experience and gut feelings than data to make business decisions.
632. What The Shell?! (Podcast Transcript)
Amy Tom chats with Michael Nitschinger, a Software Architect at Couchbase, and Jonathan D. Turner (AKA JT), the Co-Creator of NuShell, about Shell scripting.
633. Developing, Packaging and Distributing a Python Library
How to use new packaging standards with virtual environment tools, adapted from the official documentation of python.org and Pipenv
634. A Detailed Guide To Using Apache Storm
Continuous streams of data are ubiquitous and becoming even more so with the increasing number of IoT devices being used. Of course this data is stored, processed and analyzed to provide predictive, actionable results. But petabytes take long to analyze, even with Hadoop (as good as MapReduce may be) or Spark (a remedy to the limitations of MapReduce).
635. How to Use Macro Data Points to Understand SMB Financial Performance
A constant and long-standing pain point for SMBs has been the difficulty they face in securing financing compared to their larger peers. This has been due in large part to traditional banks considering SMBs to be both high-risk and high-cost to underwrite, onboard, and serve.
636. Heatmaps 101: What are Heatmaps and How to Use Them in your Business
Data has changed our lives drastically. So much so, that there are many debates and discussions on how data has surpassed oil as a commodity. Although you can’t really compare the two, data is a valuable resource, and data is to the digital economy what oil is to the industrial economy.
637. 10 Ways to Optimize Your Database
Take these 10 steps to optimize your database.
638. Real-Time Data Processing for Analytical Use Cases: Is it Worth it?
How can you reap the benefits of real-time processing with the least amount of architectural changes and maintenance effort?
639. 5 Data Management Principles That Matter in 2021
Let’s consider a few fundamental data management principles that matter. Data management is less about filing information and more about finding order.
640. A Beginner's Guide to Data Structures and Algorithms
Data structures and algorithms allow you to write better code, solve complex problems, and understand the inner workings of computer programs.
641. The Impact of Big Data In Business, Past and Future
Big Data: full-size disruptive
642. How LTO Network helps Businesses Utilize Blockchain Technology
The brainchild of Satoshi Nakamoto, Bitcoin is the face of blockchain technology. It was devised as a digital currency in answer to current financial instruments such as fiat money, i.e., USD, euros, and many others. Bitcoin aside, the tech world has found the underlying technology to be a possible solution to various problems in businesses and legacy systems.
643. Is Data Privacy Different in The 21st Century?
The digital age brought numerous conveniences that are appreciated to date; however, other concerns arose as the dark side of technology started to show itself.
644. How to Forecast Purchase Orders for Shopify Stores Using Open-Source
Use the open-source integrated machine learning in MindsDB and the open-source data integration platform Airbyte to forecast Shopify store metrics.
645. How the Ancient Egyptians Built the Original Skyscrapers with Data
I originally published this story for the Atlan Humans of Data publication.
646. Data Science Skills Matrix: Why Critical Thinking is Most In-Demand
Whether you’re brand new to data science, have gotten your feet wet in this field or are an expert, you should know that working with data is all about generating knowledge.
647. Synthetic Data’s Role in the Future of AI
Thanks to advanced data generation techniques, synthetic data can replicate real-world scenarios with high levels of accuracy.
648. Data Privacy: Why The Existing Architecture is in Dire Need of Evolution
Data is to the 21st century what oil was for the 20th century. The importance of data in the 21st century is conspicuous. Data is behind the exponential growth witnessed in the digital age. Increased access to data, through the internet and other technologies, has made the world a global village.
649. The Question Of Control Over Data in Shared Databases
Data is one of the most important tools of our time. Its usage is now pervasive throughout all aspects of society: from air transport to banking, construction to dentistry, from education to farming and beyond.
650. Which Database Is Right For You? Graph Database vs. Relational Database
Learn about the main differences between graph and relational databases: which use cases suit each type best, and their strengths and weaknesses.
651. Syncing Data from Coda to Google Sheets And Vice Versa with Google Apps Script [A How-To Guide]
Last year I published a tutorial on how to sync data between two Coda docs and data between two Google Sheets. What was missing from the tutorial was how to sync data between a Coda doc and a Google Sheet.
652. Get Started With Big Data Analytics For Your Business.
Everything we do generates data; therefore, we are data agents. The question is: how can we benefit from this huge amount of data generated every day?
653. Setting up Continuous PostgreSQL Backups
This manual describes the process of setting up continuous backups for PostgreSQL databases to safeguard your data from accidental loss in an efficient way.
654. Data Gathering Methods: How to Crawl, Scrape, and Parse Data Online
The internet is a treasure trove of valuable information. Read this article to find out how web crawling, scraping, and parsing can help you.
655. Introducing GitHub Based Airports API Service
Hello developers and enthusiasts! 😍
656. Putting Value back in the Data Economy with Pool Data CEO Shiv Malik
In this slogging AMA, we host the CEO of Pool Data, Shiv Malik. Shiv walks us through Pool Data and how it supports data unions.
657. The 2021 AI Rewind: HackerNoon Edition
A curated list of the latest breakthroughs in AI and Data Science by release date with a clear video explanation
658. AI Will Reshape the Cybersecurity World in 2021
Cybersecurity providers will step up AI development to merge human and machine understanding to outpace cybercriminals' goal of staging an arms race.
659. The Importance of a Single Source of Truth for Enterprises
A single source of truth (SSOT) enables that synchronization. A company with SSOT relies on one and only one point of reference for the latest, aggregated info.
660. Big Tech Is Acquiring Access to Your Health & Home
SMART HOMES: THE FINAL FRONTIER
661. 5 Life-Saving Tips About Cyber Security
Introduction:
662. Keep Sharing Context: How to Enable Better and Faster Product Decisions
“So, what do you think?” — says the Product Manager after a product strategy presentation to his team
663. Revolutionizing the Value of Data with cheqd
In this Slogging AMA, the team at Cheqd joined us to explain why and how their platform enables the average user and business to take control of their data.
664. What Apple And Spotify Know About Me
Unsurprisingly, the data that our apps have collected about us is both impressive and concerning, though it can be very interesting to review and explore it.
665. Identifying The Poor in India: A Data Driven Analysis
Ever since I quit the corporate world, the story I have been telling myself is that I want to work on uplifting the poorest. It sounds romantic at the onset but like most things, is a lot more complicated when you get down into the weeds.
666. Data Can Help You: How Technologies Fight Mental Health Issues
Medical technologies are not limited to remote examinations, robotic surgical controllers, and diagnostic algorithms. Today they are transforming the mental health domain, specifically the methods of working with patients and the doctor’s role.
667. An Introductory Guide to Variables and Data Types in Go
Hello there! Today we will be learning about Go variables and the different data types available in Go.
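As a taste of what such a guide covers, here is a minimal sketch of the common ways to declare variables in Go and the types they end up with. The helper `describe` and all variable names are made up for this illustration and are not taken from the article.

```go
package main

import "fmt"

// describe is a hypothetical helper that formats a value together
// with its dynamic type, e.g. "0.75 (float64)".
func describe(v interface{}) string {
	return fmt.Sprintf("%v (%T)", v, v)
}

func main() {
	// Explicit type, no initializer: the variable holds the zero value.
	var count int

	// Initializer without a type: the type is inferred as string.
	var name = "gopher"

	// Short declarations, allowed only inside functions.
	ratio := 0.75
	ready := true

	// Untyped constant; it becomes an int when passed to describe.
	const maxItems = 10

	fmt.Println(describe(count))
	fmt.Println(describe(name))
	fmt.Println(describe(ratio))
	fmt.Println(describe(ready))
	fmt.Println(describe(maxItems))
}
```

The `%T` verb prints the type Go inferred, which is a quick way to check what a short declaration actually produced.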
668. How to Choose the Right Hyper-V Backup Strategy
This post discusses the main data protection strategies that can help you keep your Hyper-V data secure at all times.
669. Why and How We Used Singer to Bootstrap Our MVP
One of the (many) hard things about doing a startup is figuring out what that MVP should be. You are trading off between presenting something that is “good” enough that it gets people excited to use (or invest in) you and getting something done fast. In this article, we explore how we wrestled with this trade-off. Specifically, we explore our decisions around how to use Singer to bootstrap our MVP. It is something we get tons of questions about, and it was hard for us to figure out ourselves!
670. How Latin American Startups Can Make Their Own User Data Their Growth Superpower
Investors are increasingly cautious when contributing to the growth of startups through personalized support. So, what can founders do about it?
671. Ransomware: AIDS, Scientists, and a Floppy Disk
Global technological trends are pushing scammers to create more inventive ways to collect ransom payments.
672. How and Why We Choose to Clone all Data on Github
Why would anyone choose to create and continuously maintain a perfect clone of all data on GitHub? Debricked has the answer.
673. Top Data Analyst Skills in 2021
Enhance your knowledge and skills in the field of data analytics with the help of data science certification for a rewarding career as a data analyst.
674. It’s in the Data: How COVID-19 is Affecting the Digital Landscape
I’m sure almost everyone reading this has been affected by the emergence of the novel coronavirus disease (COVID-19), in addition to noticing some serious disruptive economic changes across most industries. Our data research department here at Oxylabs has confirmed these movements, especially in the e-commerce, human resources (HR), travel, accommodation and cybersecurity segments.
675. Equifax will pay up to $700 million over one of the worst breaches in U.S History!
I still remember that day like yesterday.
676. How is Web Crawling Used in Data Science
No-Code tools for collecting data for your Data Science project
677. How to Find Market Fit for Data Products
By the time I entered the bar on that rainy spring afternoon, Justin had already started on his cocktail. It had been a few months since I saw him last; after his product design firm ended their work with my previous healthcare technology employer, he had taken on some new projects and it was tough to find time to connect. I had recently left that employer myself to take on a new job that ticked all the boxes: pay raise, prestigious company, work from home, great boss. Plenty of changes to catch up on.
678. If You’re Trying to Talk to Everybody, You’re Not Reaching Anybody
2019 Tech Trends for Marketers: How finely tuned is your targeting?
679. Data Modeling in Salesforce and Heroku Data Services
This is the third article documenting what I’ve learned from a series of 10 Trailhead Live video sessions on Modern App Development on Salesforce and Heroku.
680. Go vs Rust: A Sto-array of Arrays
Want to see disappearing data in Go caused by an innocent append of data to an array? Can this happen in Rust, too? Check this data-driven horror story out!
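The article's own example is not reproduced here. As a hedged sketch, one well-known way for data to "disappear" after an innocent `append` in Go is slice aliasing: two appends to a slice with spare capacity write into the same backing array. The function names below are hypothetical.

```go
package main

import "fmt"

// appendAliased appends to the same base slice twice. base has spare
// capacity, so both appends reuse its backing array and the second
// append overwrites the element written by the first.
func appendAliased() ([]int, []int) {
	base := make([]int, 2, 4) // len 2, cap 4
	base[0], base[1] = 1, 2
	a := append(base, 3) // writes 3 into the shared backing array
	b := append(base, 4) // overwrites that 3 with 4
	return a, b
}

// appendCopied caps the capacity with a full slice expression, so each
// append has to allocate a fresh backing array.
func appendCopied() ([]int, []int) {
	base := make([]int, 2, 4)
	base[0], base[1] = 1, 2
	a := append(base[:2:2], 3) // cap 2 forces a copy
	b := append(base[:2:2], 4)
	return a, b
}

func main() {
	a, b := appendAliased()
	fmt.Println(a, b) // [1 2 4] [1 2 4]

	c, d := appendCopied()
	fmt.Println(c, d) // [1 2 3] [1 2 4]
}
```

Rust's ownership rules reject the equivalent shared mutable access at compile time, which is the kind of contrast the title hints at.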
681. 5 Steps to Master Customer Intelligence in 2021
Customer intelligence is the process of gathering and analyzing customer data.
682. Why Data Quality is Key to Successful ML Ops
In this first post in our 2-part ML Ops series, we are going to look at ML Ops and highlight how and why data quality is key to ML Ops workflows.
683. Event-Driven Change Data Capture: Introduction, Use Cases, and Tools
How to detect, capture, and propagate changes in source databases to target systems in a real-time, event-driven manner with Change Data Capture (CDC).
684. The Effectiveness of AI and ML on Supply Chains Amidst a Global Pandemic
COVID-19's impact on the supply chain industry has been significant. Here is how to mitigate the situation by making the best of different optimization approaches.
685. How To Avoid Manipulating Data Subconsciously: A P-Hacking Story
The P value is the probability of seeing results at least as extreme as ours if only random chance were at work. P-Hacking is a term used to describe the manipulation of data or analyses to get the desired P value. All of us do this with our experiments, consciously or not.
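One way to see why P-hacking works is the arithmetic of repeated testing. The sketch below assumes independent tests with no real effect in any of them, and computes the chance that at least one still comes out "significant". The function name is made up for this illustration.

```go
package main

import (
	"fmt"
	"math"
)

// familywiseErrorRate returns the probability of at least one false
// positive across k independent tests at significance level alpha,
// assuming every null hypothesis is true: 1 - (1 - alpha)^k.
func familywiseErrorRate(alpha float64, k int) float64 {
	return 1 - math.Pow(1-alpha, float64(k))
}

func main() {
	for _, k := range []int{1, 5, 20} {
		fmt.Printf("%2d tests: %.2f\n", k, familywiseErrorRate(0.05, k))
	}
}
```

At the usual 0.05 threshold, trying 20 independent analyses gives roughly a 64% chance of at least one spurious "significant" result, which is why re-slicing the data until something passes is so misleading.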
686. Ultimate Guide to Synthetic Monitoring Products
As we look forward to 2021, Synthetic Monitoring continues to be as important as ever in understanding the performance of your app or website. But your synthetic monitoring is only as good as the tool you're using and there are a lot of product choices. Since selecting the best one for you is critical, the choice can be overwhelming. Price, setup ease, accuracy, and more play a part in the best solution.
687. What is Data-Centric AI?
What makes GPT-3 and DALL·E powerful is exactly the same thing: data.
688. How I Created a Simpsons Dataset for Instance Segmentation
This post is about creating your own custom dataset for Image Segmentation/Object Detection. It provides an end-to-end perspective on what goes on in a real-world image detection/segmentation project.
689. How I Created a Zero Trust Overlay Network in my Home
Enabling a secure home automation experience, by creating a zero trust overlay network to access #HomeAssistant.
690. Privacy As A Service: The DuckDuckGo Application
[Full disclosure: this post was written in promotion of fourweekmba.] It was September 25th, 2008.
691. Tutorial: Swift and SwiftUI for Data Science iOS Development
Swift and SwiftUI for Data Science
692. How to Create World Leading Databases
Jason Repp is the SVP of HarperDB, a world-leading database and development platform that is leading the charge in terms of performance, flexibility, and ease.
693. Staring into the Black Mirror with The Most Connected Man on Earth
All throughout his day, Chris is connected to numerous sensors that collect the data that make up his life.
Thank you for checking out the 693 most read stories about Data on HackerNoon.
Visit the /Learn Repo to find the most read stories about any technology.