Blog

The Geospatial Capabilities of Microsoft Fabric and ESRI GeoAnalytics, Demonstrated

The saying goes that 80% of data collected, stored and maintained by governments can be associated with geographical locations. Although never empirically proven, it illustrates the importance of location within data. Ever-growing data volumes put constraints on systems that handle geospatial data. Common Big Data compute engines, originally designed to scale for textual data,

Strength in Numbers: Ensembling Models with Bagging and Boosting

Bagging and boosting are two powerful ensemble techniques in machine learning – they are must-knows for data scientists! After reading this article, you are going to have a solid understanding of how bagging and boosting work and when to use them. We’ll cover the following topics, relying heavily on examples to give hands-on illustration of

Efficient Graph Storage for Entity Resolution Using Clique-Based Compression

In the world of entity resolution (ER), one of the central challenges is managing and maintaining the complex relationships between records. At its core, Tilores models entities as graphs: each node represents a record, and edges represent rule-based matches between those records. This approach gives us flexibility, traceability, and a high level of accuracy, but

Survival Analysis When No One Dies: A Value-Based Approach

Survival Analysis is a statistical approach used to answer the question: “How long will something last?” That “something” could range from a patient’s lifespan to the durability of a machine component or the duration of a user’s subscription. One of the most widely used tools in this area is the Kaplan-Meier estimator. Born in the

Get Started with Rust: Installation and Your First CLI Tool – A Beginner’s Guide

Rust has become a popular programming language in recent years because it combines security with high performance and can be used in many applications. It pairs the strengths of C and C++ with the modern syntax and simplicity of other programming languages such as Python. In this article, we will take a step-by-step look

Non-Parametric Density Estimation: Theory and Applications

In this article, we’ll talk about what Density Estimation is and the role it plays in statistical analysis. We’ll examine two popular density estimation methods, histograms and kernel density estimators, covering their theoretical properties as well as how they perform in practice. Finally, we’ll look at how density estimation may be used as a

Rethinking the Environmental Costs of Training AI — Why We Should Look Beyond Hardware

Summary of This Study: Hardware choices – specifically hardware type and its quantity – along with training time, have a significant positive impact on energy, water, and carbon footprints during AI model training, whereas architecture-related factors do not. The interaction between hardware quantity and training time slows the growth of energy, water, and carbon consumption
