Jekyll2023-07-21T06:41:19+00:00https://www.valdas.blog/Valdas’ blogThoughts about software, big data and random stuffValdas MaksimavičiusAvoid these two Azure pricing mistakes!2020-07-28T00:00:00+00:002020-07-28T00:00:00+00:00https://www.valdas.blog/2020/07/28/azure-pricing-mistakes<p>What happens when you feel unconditional love for your cloud provider? You trust your cloud services too much. It leads to mistakes, and the most critical blunders are pricing and security. One would say it’s only one - pricing :)</p>
<p>Eventually, you miss your success criteria and budget estimates just like Apollo 13 missed the Moon.</p>
<p>I have no doubt you will agree with me - we don’t want to be anywhere close if that ever happens. <strong>Anything that can go wrong will go wrong.</strong></p>
<p>Here is a case study of two pricing mistakes I made in Q4 2019. I am sharing the details in the hope that you can avoid them.</p>
<!--more-->
<p>By the way, even though my examples are about specific Azure services, you can make similar mistakes with other offerings / other cloud providers.</p>
<p>To efficiently use cloud services, you need to develop extra caution.</p>
<h2 id="heres-how-to-waste-1225-eur-on-databricks">Here’s how to waste 1225 Eur on Databricks</h2>
<p>The first mistake is related to Databricks. Remember - storage is cheap, computation is expensive.</p>
<p>The workhorse for data engineering and data science in Azure is Databricks. The price is primarily based on cluster uptime.</p>
<p>When I calculate the price for Databricks, I try to figure out things like the number of users, data volumes and complexity, and work hours. Based on the work hours, I can make initial assumptions about how long the interactive clusters might run.</p>
<p>And here’s the magic (especially relevant for irregular working hours): admins don’t need to terminate the clusters at an agreed time. Instead, Databricks has an auto-shutdown feature.</p>
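<p>As a rough sketch of that estimate: the function below multiplies expected uptime by an hourly rate. The function name, the 4 EUR per cluster-hour rate, and the day counts are all assumptions for illustration, not real Databricks prices.</p>

```python
def estimate_monthly_cluster_cost(hours_per_day, days_per_month, hourly_rate_eur):
    """Rough monthly cost of one interactive cluster, driven purely by uptime."""
    return hours_per_day * days_per_month * hourly_rate_eur


# Hypothetical inputs: 8 work hours a day, 21 work days, 4 EUR per cluster-hour.
planned = estimate_monthly_cluster_cost(8, 21, 4.0)

# The same cluster if it never terminates: 24 hours a day, 30 days.
always_on = estimate_monthly_cluster_cost(24, 30, 4.0)

print(f"planned: {planned:.0f} EUR, always on: {always_on:.0f} EUR")
```

<p>The gap between the two numbers is the budget you put at risk if the uptime assumption turns out to be wrong.</p>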
<p>During Databricks cluster creation, there is a simple checkbox, visible in the image below.</p>
<p><img src="/images/2020/07/dbr_cluster.jpg" alt="" class="img-responsive" /></p>
<p>When enabled, the cluster will terminate after the specified time interval of inactivity.</p>
<p>Inactivity? Great, that means the cluster will not run when there are no users and I will not pay for it.</p>
<p><strong>Not exactly!</strong></p>
<p>Databricks clusters terminate if there are no running commands or active job runs. It doesn’t matter if users are logged in or not.</p>
<p>What if there is a bug and a task got stuck running?
What if someone started a streaming listener and forgot about it?
What if someone executed a “while True:” loop?</p>
<p>I understand the functionality, as you can’t just terminate a cluster while it’s running a piece of code.</p>
<p>But sometimes, your cluster might keep running because of some rubbish code that your team member executed a while ago.</p>
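<p>One way to catch such clusters is to check uptime yourself instead of relying on inactivity. The sketch below is a minimal example that works on a plain list of dictionaries. The field names (<code>cluster_name</code>, <code>state</code>, <code>start_time</code> in epoch milliseconds) are modeled on what a Databricks cluster listing returns, but treat them, the sample data, and the 12-hour threshold as assumptions.</p>

```python
from datetime import datetime, timedelta, timezone


def find_long_running(clusters, max_hours, now=None):
    """Return the names of RUNNING clusters whose uptime exceeds max_hours.

    Each cluster is a dict with 'cluster_name', 'state' and 'start_time'
    (milliseconds since the Unix epoch). These field names are assumptions.
    """
    now = now or datetime.now(timezone.utc)
    limit = timedelta(hours=max_hours)
    flagged = []
    for cluster in clusters:
        if cluster["state"] != "RUNNING":
            continue  # terminated clusters cost nothing, skip them
        started = datetime.fromtimestamp(cluster["start_time"] / 1000, tz=timezone.utc)
        if now - started > limit:
            flagged.append(cluster["cluster_name"])
    return flagged


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


# Hypothetical sample data with a fixed "now" so the result is reproducible.
now = datetime(2019, 11, 15, 12, 0, tzinfo=timezone.utc)
sample_clusters = [
    {"cluster_name": "stuck-streaming", "state": "RUNNING",
     "start_time": to_epoch_ms(now - timedelta(hours=30))},
    {"cluster_name": "fresh-adhoc", "state": "RUNNING",
     "start_time": to_epoch_ms(now - timedelta(hours=2))},
    {"cluster_name": "old-terminated", "state": "TERMINATED",
     "start_time": to_epoch_ms(now - timedelta(hours=100))},
]

flagged = find_long_running(sample_clusters, max_hours=12, now=now)
print(flagged)
```

<p>A check like this does not replace auto-terminate. It only tells you which clusters deserve a second look before someone terminates them manually.</p>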
<p><strong>Lesson #1: Don’t trust auto-terminate blindly. Terminate manually if you don’t use your cluster.</strong></p>
<h2 id="heres-how-to-burn-over-3600-eur-with-azure-synapse-pools-sql-data-warehouse">Here’s how to burn over 3600 Eur with Azure Synapse Pools (SQL Data Warehouse)</h2>
<p>The second blunder - a classic start-and-forget scenario.</p>
<p>We had a small workshop with demanding business users. They regularly escalate even the smallest issues to the C level. We call them “clients from hell”.</p>
<p>I trust you can find similar users within your organization :)</p>
<p>Our “clients from hell” are advanced data analysts who specialize in SQL. They want to crunch dozens of terabytes.</p>
<p>After some planning, we decided to organize a small workshop and let them play with SQL Data Warehouse (Azure Synapse SQL Pools).</p>
<p>Initially, our team thought that DWU 500 might be enough. But one engineer wanted to impress our users and increased it to DWU 2000.</p>
<p><strong>The workshop was a huge success!</strong></p>
<p>We received positive feedback and we felt like heroes. And we hit the bar, of course.</p>
<p>What about Azure Synapse SQL Pool?</p>
<p>We remembered it a week later. It ran for nearly 144 hours instead of 8.</p>
<p>Luckily, the clients enjoyed the workshop and the new possibilities, and didn’t mind the increased cost too much.</p>
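<p>To put the overrun in numbers, here is a small worked example. The hourly rate of 25 EUR is a hypothetical figure picked for illustration, not a quoted Azure price. The ratio between planned and actual cost does not depend on it.</p>

```python
HOURLY_RATE_EUR = 25.0  # hypothetical rate, an assumption for illustration only


def pool_cost(hours, hourly_rate_eur):
    """Cost of a provisioned pool that is billed for every hour it is up."""
    return hours * hourly_rate_eur


planned_cost = pool_cost(8, HOURLY_RATE_EUR)    # the workshop day
actual_cost = pool_cost(144, HOURLY_RATE_EUR)   # nearly a week of uptime
overrun_factor = actual_cost / planned_cost     # independent of the rate

print(f"planned: {planned_cost:.0f} EUR, actual: {actual_cost:.0f} EUR, "
      f"factor: {overrun_factor:.0f}x")
```

<p>Whatever the real rate is, forgetting the pool for 144 hours instead of 8 multiplies the bill by 18.</p>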
<p><strong>Lesson #2: There are a few very expensive services. Azure Synapse SQL Pools is one of them. Use it with caution!</strong></p>
<p><img src="/images/2020/07/synapse_pricing.jpg" alt="" class="img-responsive" /></p>
<p>To sum up, don’t trust yourself, your team and your cloud provider blindly. Trust, but check.</p>
<p>Set up pricing alerts (<a href="https://www.valdas.blog/2019/10/27/azure-subscriptions/" target="_blank">I wrote a blog post about it</a>). However, alerts are reactive: you only get notified once something is already wrong. It might be too late.</p>
<p>Secondly, use all services with caution. Set up reminders in your Outlook to verify usage from time to time. Double-check your automation scripts. Encourage your team to shut down resources, and don’t rely on auto-shutdown features blindly.</p>
<p>I also recorded a short video about it. Check it out.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/l096PDhxSuw" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
Remote work after the pandemic2020-04-20T00:00:00+00:002020-04-20T00:00:00+00:00https://www.valdas.blog/2020/04/20/remote-work-after-pandemic<p>Should we continue working remotely after the pandemic ends? It is a question many companies and leaders are going to face soon.</p>
<p>In this post, I want to warn you: many won’t be as productive working remotely after the pandemic as they were during it. However, this might not apply to teams that worked remotely long before COVID-19.</p>
<!--more-->
<p>Memes are flying all over the internet about how COVID-19 led the digital transformation of many companies. Even the most conservative enterprises, where working from home seemed impossible, managed to set it up within a week or two. Work inboxes were flooded with tips and tricks on how to be productive at home. Povilas Korop summarized it beautifully:</p>
<p><img src="/images/2020/04/povilas.jpg" alt="" class="img-responsive" /></p>
<p>Over the past years, I’ve been working with remote teams spread over the Nordics. Collaboration with distributed stakeholders requires way more effort, no doubt about that. But even if you master remote work tools to perfection, there is no technology to replace face-to-face discussion. For that reason, I often traveled to meet the teams, discuss problems and whiteboard solutions.</p>
<p>Now, I can’t travel and we are still quite productive. Does it mean I won’t need to meet my teams face to face ever again? Can I work out of my home forever now? NO!</p>
<p><strong>First of all</strong>, some kinds of work are more suitable for remote than others. Technically speaking, software developers can work from almost any place in the world. But there are aspects related to security, client agreements, data location, legal stuff. Working from home increases risks related to</p>
<h3 id="even-if-you-got-approvals-to-work-from-home-during-the-pandemic-it-doesnt-mean-these-privileges-will-last-after-covid-19-ends">Even if you got approvals to work from home during the pandemic, it doesn’t mean these privileges will last after COVID-19 ends</h3>
<p><strong>Secondly</strong>, I’ve heard one argument: <em>“Now everyone works well because they have to stay at home. Wait till people can go outside and you won’t see them. Forget about producing deliverables and achieving project objectives on time”</em>. If you are dealing with such a professional attitude, that probably means saying goodbye rather quickly to such individuals.</p>
<p>It brings me to the <strong>third point</strong>, beautifully summarized by Randy Shoup in one of his talks - <a href="https://www.infoq.com/presentations/high-performance-remote-distributed-team/" target="_blank">“High Performance Remote and Distributed Teams”</a>.</p>
<blockquote>
<p>“A strong recommendation that I have reinforced over and over again in my experience is a team should be collocated in a physical space to take advantage of that, or everybody on the team behaves as if they’re remote, even if they happen to be in the same place. The anti-pattern or the ones that don’t match this are mostly people are in the same site, except for Randy who’s remote. That never works. It works for a little while if you already know you could trust Randy, but over time, the isolation and the separation is very difficult.” - Randy Shoup</p>
</blockquote>
<h3 id="in-other-words-to-be-efficient-a-team-has-to-be-fully-onsite-at-the-office-or-all-members-connect-remotely">In other words, to be efficient, a team has to be fully onsite at the office or all members connect remotely</h3>
<blockquote>
<p>When you have humans all together, we are social animals, and so it’s going to be very difficult to avoid having the local quick architecture conversation and design conversation, and then informing Randy later, or maybe even forgetting to inform Randy later. We’ve all had that experience, that doesn’t work. That’s why I strongly say that at the team granularity, squad, whatever you want to say, it should be all in single site or remote-first.</p>
</blockquote>
<h3 id="its-so-easy-to-make-decisions-over-a-cup-of-coffee-its-even-easier-to-forget-to-inform-a-remotely-working-colleague-about-it">It’s so easy to make decisions over a cup of coffee. It’s even easier to forget to inform a remotely working colleague about it.</h3>
<p><strong>Last but not least</strong>, you consider yourself an introvert and you enjoy working from home? You do hope it will last forever? You might be interested in psychological studies on mistakenly seeking solitude. Dr Laurie Santos in <a href="https://www.happinesslab.fm/season-1-episodes/mistakenly-seeking-solitude" target="_blank">her podcast The Happiness Lab Episode 4</a> explores ways in which talking to people can bring us all genuine joy.</p>
<h3 id="even-the-hardest-introverts-will-benefit-from-some-interactions">Even the hardest introverts will benefit from some interactions.</h3>
<p><strong>To summarize</strong>, remote work requires extra effort to be truly productive. Some can’t wait to go back to the office, chat with others over a cup of coffee, do work, and enjoy a workplace without kids running around. Others got used to remote work and proved to themselves that they can deliver from home. Some work harder, not being able to distinguish between private and work time. Some find extra time, as they don’t need to sit at the office once they finish the work.</p>
<h3 id="two-tips-i-suggest-going-forward">Two tips I suggest going forward:</h3>
<ol>
<li>If possible, avoid situations where some individuals work remotely and the rest of the team is in the office. Instead, choose days when everyone works remotely.</li>
<li>Stop making important decisions over a cup of coffee. Hold every work-related meeting via Teams/Slack and document your agreements. For example, use Architecture Decision Records to track system changes.</li>
</ol>
<hr />
<p>One of the best resources on remote work I’ve found so far is <a href="https://www.toptal.com/remote-work-playbook" target="_blank">The Suddenly Remote Playbook</a>, created by the Toptal team.</p>
Free notebook projects to explore2020-04-18T00:00:00+00:002020-04-18T00:00:00+00:00https://www.valdas.blog/2020/04/18/free-notebooks<p>Notebooks (meaning interactive computational environments, not notepads or laptops) are taking over conventional data tools. Here are some free, interesting notebook projects to explore:</p>
<!--more-->
<h3 id="observable">Observable</h3>
<p><a href="https://observablehq.com/" target="_blank">Observable</a> lets you run data science and visualizations using customized JavaScript. Use Apache Arrow, Vega, d3, TensorFlow and share beautiful results with others.</p>
<p>To get a sense of the art of the possible, check out this <a href="https://observablehq.com/@lounjukk/covid-19-corona-virus-deaths-per-1-000-000-people" target="_blank">notebook.</a></p>
<p><img src="/images/2020/04/observable.jpg" alt="" class="img-responsive" /></p>
<h3 id="databricks-community">Databricks Community</h3>
<p>Think of Databricks if you want to explore Apache Spark, Delta Lake, MLflow or Koalas. <a href="https://community.cloud.databricks.com/" target="_blank">Databricks free community edition </a> offers a driver with 15 GB of memory and no worker nodes.</p>
<p><img src="/images/2020/04/databricks.png" alt="" class="img-responsive" /></p>
<h3 id="jupyterlab">JupyterLab</h3>
<p><a href="https://mybinder.org/v2/gh/jupyterlab/jupyterlab-demo/master?urlpath=lab/tree/demo" target="_blank">JupyterLab</a> calls itself a next-generation web-based user interface for Project Jupyter. Without further ado, check it out yourself.</p>
<p><img src="/images/2020/04/jupyter.jpg" alt="" class="img-responsive" /></p>
<h3 id="binder">Binder</h3>
<p><a href="https://mybinder.org/" target="_blank">Binder</a> is my latest finding. You specify a git repo with interesting notebooks, and Binder builds a Docker image for you to interact with your notebooks. Great for trainings and workshops!</p>
<p><img src="/images/2020/04/binder.jpg" alt="" class="img-responsive" /></p>
Cloud Analytics #5 - Gartner Magic Quadrants, AI examples, space explosions and more (February 2020)2020-03-04T00:00:00+00:002020-03-04T00:00:00+00:00https://www.valdas.blog/2020/03/04/cloud-analytics-news-february<p>Here is a monthly list of my latest findings on data, AI, and cloud topics. Plus, some futuristic content that I’ve dug up in different places.</p>
<p><img src="/images/2020/03/february.png" alt="" class="img-responsive" /></p>
<!--more-->
<h3 id="gartner-magic-quadrants">Gartner Magic Quadrants</h3>
<p>In February, Gartner released a few updated technology reports. Unfortunately, I can’t link to the reports directly due to distribution licences, so instead I point you to the download pages (registration required).</p>
<ul>
<li>Magic Quadrant for Data Science and Machine Learning Platforms <a href="https://databricks.com/p/whitepaper/gartner-magic-quadrant-2020-data-science-machine-learning" target="_blank"> Register to get a report.</a></li>
</ul>
<p>There are six vendors in the Leaders quadrant: two that were leaders last year (SAS and TIBCO) and four new ones (Alteryx, Dataiku, Databricks, MathWorks).</p>
<p>Please note that the report includes only vendors with commercial products. Open-source platforms like Python and R, even though they are very popular with Data Scientists and Machine Learning professionals, are not included by Gartner.</p>
<ul>
<li>Magic Quadrant for Analytics and Business Intelligence Platforms <a href="https://info.microsoft.com/ww-landing-2020-gartner-magic-quadrant-for-analytics-and-business-intelligence.html?LCID=EN-US?ls=Website" target="_blank"> Register to get a report.</a></li>
</ul>
<p>Microsoft continues to be a Magic Quadrant Leader in Analytics and Business Intelligence platforms with a feature-packed Power BI solution.</p>
<h3 id="data-analytics-software">Data, analytics, software</h3>
<ul>
<li>Visual storytelling</li>
</ul>
<p>Various topics explained in informative visuals. <a href="https://www.visualcapitalist.com/chinas-113-cities-one-million-people-population/" target="_blank">For example, “Meet China’s 113 Cities With More Than One Million People”.</a></p>
<ul>
<li>
<p>50 AI Examples from the World’s biggest companies. <a href="https://www.manceps.com/ai-examples" target="_blank"> Read more.</a></p>
</li>
<li>
<p>Spark certification guide by Raki Rahman. <a href="https://www.linkedin.com/pulse/spark-simplified-certification-study-guide-raki-rahman/" target="_blank">Read more.</a></p>
</li>
<li>
<p>DevOps Roadmap: a step-by-step guide with all the latest and critical technologies every DevOps engineer should know. <a href="https://roadmap.sh/devops" target="_blank"> Read more.</a></p>
</li>
</ul>
<h3 id="futurism-and-space-exploration">Futurism and space exploration</h3>
<ul>
<li>Astronomers Detect Biggest Explosion Since the Big Bang. <a href="https://futurism.com/astronomers-detect-biggest-explosion-since-big-bang" target="_blank"> Read more.</a></li>
</ul>
<p>The blast came from a supermassive black hole at the centre of a galaxy hundreds of millions of light-years away.
My colleague summarized it beautifully - “While we fight our battle with coronavirus, millions of potential civilizations died in an explosion.”</p>
<p><img src="/images/2020/03/space_explosion.jpg" alt="" class="img-responsive" />
<sub>Credit: X-ray: NASA/CXC/Naval Research Lab/Giacintucci, S.; XMM:ESA/XMM; Radio: NCRA/TIFR/GMRTN; Infrared: 2MASS/UMass/IPAC-Caltech/NASA/NSF</sub></p>
Cloud Analytics #4 - Google’s new AI assistant, AI-infused products, and more (January 2020)2020-02-02T00:00:00+00:002020-02-02T00:00:00+00:00https://www.valdas.blog/2020/02/02/cloud-analytics-news-january<p>Here is a monthly list of my latest findings on data, AI, and cloud topics. Plus, some futuristic content that I’ve dug up in different places.</p>
<p><img src="/images/2020/01/january.png" alt="" class="img-responsive" /></p>
<!--more-->
<h3 id="podcasts-i-enjoyed-in-january">Podcasts I enjoyed in January</h3>
<ul>
<li>
<p>Artificial Intelligence in Industry by Daniel Faggella. Dan invites top-level leaders from different industries that specialize in AI. Short (up to 30 minutes) and insightful discussions that you can <a href="https://open.spotify.com/show/4gD9xiYU9iC24vnjUx1PTg" target="_blank">listen on Spotify.</a></p>
</li>
<li>
<p>Think Like a CEO. So what does a CEO do every day? After listening to these short discussions you will get a grasp of the CEO’s responsibilities. <a href="https://open.spotify.com/show/6MKnmWJ1dEGqosVVASDBOR" target="_blank">Listen on Spotify.</a></p>
</li>
</ul>
<h3 id="interesting-materials-on-ai-software-development-big-data-etc">Interesting materials on AI, Software Development, Big Data, etc.</h3>
<ul>
<li>The annual Consumer Electronics Show (CES) typically hosts presentations of new products and technologies. This year AI was an integral part of many new releases. AI-infused beauty products, smart washing machines, autonomous tractors, and cars. <a href="https://www.datanami.com/2020/01/10/ai-was-everywhere-at-ces/" target="_blank">Read more.</a></li>
</ul>
<p><img src="/images/2020/01/tractor.png" alt="" class="img-responsive" />
<sub>Deere’s Scan & Spray uses computer vision to only spray weeds with Round-Up (image courtesy Deere)</sub></p>
<ul>
<li>Google has created a more advanced conversational bot than existing Siri, Alexa, and Cortana. Meet Meena. The model has 2.6 billion parameters and is trained on 341 GB of text, filtered from public domain social media conversations.
<a href="https://ai.googleblog.com/2020/01/towards-conversational-agent-that-can.html" target="_blank">Read more.</a></li>
</ul>
<p><img src="/images/2020/01/meena.png" alt="" class="img-responsive" />
<sub>Source: https://ai.googleblog.com/2020/01/towards-conversational-agent-that-can.html</sub></p>
<ul>
<li>I hosted my first webinar about Data Engineering Patterns and Principles.</li>
</ul>
<iframe width="560" height="315" src="https://www.youtube.com/embed/m1ZmbnO1E4E" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
<p><a href="https://www.dataplatformschool.com/webinars" target="_blank">You can subscribe to get notified on my future webinars about data engineering and cloud analytics </a></p>
<h3 id="futurism-and-space-exploration">Futurism and space exploration</h3>
<p><img src="/images/2020/01/spacex.gif" alt="" class="img-responsive" />
<sub>SpaceX copyrights</sub></p>
<p>SpaceX successfully completed the most difficult test of its Crew Dragon human spacecraft. The launch was intentionally aborted, which triggered Dragon’s automatic separation at supersonic speed. As Elon Musk stated, the test was extremely difficult and it pushed the envelope in many aspects. The final checks of the spacecraft need to be executed before real humans fly to space. SpaceX aims to fly humans to the ISS by Q2. <a href="https://techcrunch.com/2020/01/19/spacex-successfully-completes-key-test-of-its-crew-dragon-human-spacecraft/" target="_blank">Read more.</a></p>
<p><img src="/images/2020/01/spacex2.gif" alt="" class="img-responsive" />
<sub>SpaceX copyrights</sub></p>
Cloud Analytics #3 - 2020 predictions, space launch calendar and more (December 2019)2019-12-30T00:00:00+00:002019-12-30T00:00:00+00:00https://www.valdas.blog/2019/12/30/cloud-analytics-news-december<p>Here is a monthly list of my latest findings on data, AI, and cloud topics. Plus, some futuristic content that I’ve dug up in different places.</p>
<p><img src="/images/2019/12/december.png" alt="" class="img-responsive" />
<sub>Photo by Danil Aksenov on Unsplash</sub></p>
<!--more-->
<h3 id="interesting-materials-on-software-development-big-data-machine-learning-etc">Interesting materials on Software Development, Big Data, Machine Learning, etc.</h3>
<ul>
<li>As an educational Christmas present, Finland is offering a free online course on artificial intelligence. <a href="https://www.elementsofai.com/" target="_blank">Access it here.</a></li>
</ul>
<blockquote>
<p>The Elements of AI course is designed to encourage people to learn the basics of artificial intelligence, irrespective of their age or education. The original goal behind this course was to educate 1% of Finnish people, which amounted to roughly 55,000 people, in the basics of AI. This goal, however, was accomplished in just a few months. Therefore, the revised goal of this course is to train 1% of European citizens in the basics of AI by 2021.</p>
</blockquote>
<ul>
<li>
<p>One of my favorite reads lately is the ThoughtWorks blog. <a href="https://www.thoughtworks.com/insights/blog/year-review-editors-top-content-2019s" target="_blank">Here is a year in review - editors’ top content for 2019</a></p>
</li>
<li>
<p><a href="https://www.cathrinewilhelmsen.net/series/beginners-guide-azure-data-factory/" target="_blank">Awesome Azure Data Factory guide (25 articles!!!) by Cathrine Wilhelmsen</a></p>
</li>
<li>
<p>Are you interested in becoming a valuable data engineer?<a href="https://github.com/andkret/Cookbook" target="_blank"> Read “Data Engineering Cookbook” by Andreas Kretz, </a> and follow Andreas on <a href="https://www.linkedin.com/in/andreas-kretz/" target="_blank">LinkedIn where he hosts live streams and produces great content</a></p>
</li>
</ul>
<h3 id="2020-predictions-yes-it-needs-a-separate-section--">2020 predictions (yes, it needs a separate section :) )</h3>
<p><strong>Disclaimer:</strong> All “predictions” are just personal opinions. Treat them as food for thought.</p>
<ul>
<li>
<p><a href="https://www.datanami.com/2019/12/23/big-data-predictions-what-2020-will-bring/" target="_blank">Big Data Predictions: What 2020 Will Bring by Alex Woodie</a></p>
</li>
<li>
<p><a href="https://towardsdatascience.com/seven-important-predictions-for-big-data-in-2020-cb243115d36f" target="_blank">Seven Important Predictions for Big Data in 2020 by Marcel Deer</a></p>
</li>
<li>
<p><a href="https://thenextweb.com/syndication/2019/12/10/10-predictions-for-data-science-and-ai-in-2020/" target="_blank">Ten predictions for data science and AI in 2020 by Jason T Widjaja</a></p>
</li>
</ul>
<h3 id="futurism">Futurism</h3>
<p><img src="/images/2019/12/space_crew.jpg" alt="" class="img-responsive" />
<sub>NASA astronauts Doug Hurley (left) and Bob Behnken (right), who are scheduled to be the first people that SpaceX launches into orbit. SpaceX</sub></p>
<ul>
<li><a href="https://www.businessinsider.com/space-calendar-events-2020-astronomy-rockets-2019-12" target="_blank">Space launch calendar 2020</a></li>
</ul>
<p><img src="/images/2019/12/space_calendar.jpg" alt="" class="img-responsive" />
<sub>Original: https://www.rmg.co.uk/discover/explore/space-stargazing/space-exploration/mission-launch-dates-2020</sub></p>
Cloud Analytics #2 - Latest news about Azure, 8K holographic display and more (November 2019)2019-12-02T00:00:00+00:002019-12-02T00:00:00+00:00https://www.valdas.blog/2019/12/02/cloud-analytics-news-2<p>Here is a monthly list of my latest findings on data, AI, and cloud topics. Plus, some futuristic content that I’ve dug up in different places.</p>
<!--more-->
<p><img src="/images/2019/12/november.png" alt="" class="img-responsive" /></p>
<h3 id="interesting-materials-on-big-data-machine-learning">Interesting materials on Big Data, Machine Learning</h3>
<ul>
<li>
<p><a href="https://azure.microsoft.com/en-us/resources/cloud-analytics-with-microsoft-azure/" target="_blank">Free book by the Packt Publishing: Cloud Analytics with Microsoft Azure (requires registration)</a></p>
</li>
<li>
<p><a href="https://www.microsoft.com/en-us/ai/ai-lab-projects" target="_blank">AI Lab projects by Microsoft</a></p>
</li>
</ul>
<p><img src="/images/2019/12/ms_labs.jpg" alt="" class="img-responsive" /></p>
<ul>
<li><a href="https://dzone.com/articles/19-free-public-data-sets-for-your-data-science-pro" target="_blank">19 Free Public Data Sets for Your Data Science Project</a></li>
</ul>
<h3 id="interesting-releases-and-updates">Interesting releases and updates</h3>
<ul>
<li>
<p><a href="https://azure.microsoft.com/en-us/updates/azure-private-link-is-now-available-in-all-regions/" target="_blank">Azure Private Link enables connectivity to Azure SQL Database, Azure SQL Data Warehouse, Azure Storage, Azure Data Lake Storage Gen2 and your own private networks</a></p>
</li>
<li>
<p><a href="https://azure.microsoft.com/en-us/blog/microsoft-cloud-in-norway-opens-with-availability-of-microsoft-azure" target="_blank">Microsoft cloud in Norway opens with availability of Microsoft Azure</a></p>
</li>
<li>
<p><a href="https://azure.microsoft.com/en-us/services/synapse-analytics/" target="_blank">Azure Synapse Analytics replaces Azure SQL Data Warehouse</a></p>
</li>
<li>
<p><a href="https://azurecharts.com/" target="_blank">There is a new portal to track all Azure updates</a></p>
</li>
</ul>
<p><img src="/images/2019/12/azure_heat.jpg" alt="" class="img-responsive" /></p>
<ul>
<li><a href="https://azure.microsoft.com/en-us/ignite/" target="_blank">See all Azure announcements from Ignite</a></li>
</ul>
<h3 id="futurism">Futurism</h3>
<ul>
<li><a href="https://www.futuretimeline.net/blog/2019/11/23.htm" target="_blank">World’s first 8K holographic display</a></li>
</ul>
<p><img src="https://www.futuretimeline.net/blog/images/1820-3d-holographic-8k-technology.gif" alt="" class="img-responsive" /></p>Valdas MaksimavičiusHere is a monthly list of my latest findings in the data, AI, cloud topics. Plus, some futuristic content that I’ve dug up in different places.Cloud Analytics #1 - Latest news about Azure, quantum computing and more (October 2019)2019-10-27T00:00:00+00:002019-10-27T00:00:00+00:00https://www.valdas.blog/2019/10/27/cloud-analytics-news-1<p>Here is a monthly list of my latest findings, things I enjoy or ponder. I look into new tools and releases made by Microsoft, its competitors and others. I collect and share with you informative materials about data, AI, cloud. Last but not least, I add futuristic content that I’ve dug up in different places.</p>
<p><img src="/images/2019/10/october.png" alt="" class="img-responsive" /></p>
<!--more-->
<h3 id="new-releases-updates-from-microsoft-databricks-and-others">New releases (updates from Microsoft, Databricks, and others)</h3>
<ul>
<li><a href="https://databricks.com/blog/2019/10/17/introducing-the-mlflow-model-registry.html" target="_blank">New MLflow model registry to simplify model management. </a> During Spark & AI Summit Databricks announced new features to MLflow: a central place to share ML models, collaborate on moving them from experimentation to testing and production, and implement approval and governance workflow.</li>
</ul>
<p><img src="/images/2019/10/mlflow.png" alt="" class="img-responsive" />
<sub>Credit: Databricks</sub></p>
<ul>
<li><a href="https://powerbi.microsoft.com/en-us/blog/announcing-automated-machine-learning-in-power-bi-general-availability/" target="_blank">Power BI announced automated machine learning in general availability.</a> Feature recommendations, model explainability, controlling training time, improved training reports - solid step towards democratizing machine learning for all users.</li>
</ul>
<p><img src="/images/2019/10/powerbi.png" alt="" class="img-responsive" />
<sub>Credit: Microsoft</sub></p>
<ul>
<li>
<p><a href="https://docs.microsoft.com/en-us/azure/data-factory/transform-data-machine-learning-service" target="_blank">Azure Data Factory supports Azure Machine Learning service pipelines as a step.</a> Finally, one can execute Azure ML service pipelines without workarounds.</p>
</li>
<li>
<p><a href="https://azure.microsoft.com/en-us/updates/azure-data-share-now-supports-sharing-data-from-azure-sql-and-sql-dw/" target="_blank">Azure Data Share supports both structured data (from Azure SQL and SQL DW) and unstructured data (from Azure Data Lake Store and Blob storage) with centralized management and governance.</a> You can share tables and views, and data consumers can receive data in any of the following Azure data stores of their choice.</p>
</li>
<li>
<p><a href="https://towardsdatascience.com/netflix-open-sources-polynote-to-make-data-science-notebooks-better-8d6820535b25" target="_blank">Netflix open sourced Polynote notebooks - an alternative for Jupyter.</a> In earlier posts, Netflix shed some light on how they work with ML and productionalize notebooks: <a href="https://medium.com/netflix-techblog/notebook-innovation-591ee3221233" target="_blank">Beyond Interactive: Notebook Innovation at Netflix</a> &
<a href="https://medium.com/netflix-techblog/scheduling-notebooks-348e6c14cfd6" target="_blank">Part 2: Scheduling Notebooks at Netflix</a></p>
</li>
<li>
<p><a href="https://sanddance.js.org/" target="_blank">Microsoft open sources SandDance, a visual data exploration tool.</a> It is available as an extension to both Visual Studio Code and Azure Data Studio and has also been re-released as a Power BI Custom Visual.</p>
</li>
</ul>
<p><img src="/images/2019/10/sanddance.gif" alt="" class="img-responsive" /></p>
<h3 id="interesting-materials-on-big-data-machine-learning">Interesting materials on Big Data, Machine Learning</h3>
<ul>
<li><a href="https://martinfowler.com/articles/cd4ml.html" target="_blank">Continuous Delivery for Machine Learning by Martin Fowler.</a> A solid read for all Data Science & Data Engineering professionals!</li>
</ul>
<blockquote>
<p>Continuous Delivery for Machine Learning (CD4ML) is a software engineering approach in which a cross-functional team produces machine learning applications based on code, data, and models in small and safe increments that can be reproduced and reliably released at any time, in short adaptation cycles. - Martin Fowler</p>
</blockquote>
<ul>
<li>
<p><a href="https://databricks.com/blog/2019/10/23/spark-ai-summit-europe-19-recap.html" target="_blank">Spark + AI in Amsterdam: European Summit Recap, Keynote Videos, & Announcements.</a> One of the hottest Big Data & Data Science conferences in Europe took place in October in Amsterdam.</p>
</li>
<li>
<p><a href="https://databricks.com/blog/2019/09/18/productionizing-machine-learning-from-deployment-to-drift-detection.html" target="_blank">Productionizing Machine Learning: From Deployment to Drift Detection.</a> A good explanation of concept drift and data drift, with ways to detect and protect against them.</p>
</li>
<li>
<p><a href="https://www.microsoft.com/en-us/itshowcase/microsoft-tames-the-wild-west-of-big-data-with-modern-data-management" target="_blank">Microsoft tames the “wild west” of big data with modern data management.</a> A super interesting read on how Microsoft used its own tools (Azure Data Platform) to enable predictive and prescriptive analytical capabilities.</p>
</li>
</ul>
<p><img src="/images/2019/10/data_management.jpg" alt="" class="img-responsive" />
<sub>Credit: Microsoft</sub></p>
<h3 id="futurism">Futurism</h3>
<ul>
<li><a href="https://www.futuretimeline.net/blog/2019/10/24.htm" target="_blank">Google claims quantum supremacy.</a> Google has announced that its 53-qubit “Sycamore” processor has achieved quantum supremacy, performing a specific task in 200 seconds that would take the world’s best supercomputers 10,000 years to complete.</li>
</ul>
<iframe width="560" height="315" src="https://www.youtube.com/embed/-ZNEzzDcllU" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
<p>I hope the most popular comment under that video is just a joke :)</p>
<p><img src="/images/2019/10/comment_google.jpg" alt="" class="img-responsive" /></p>
<ul>
<li>
<p><a href="https://www.sciencemag.org/news/2019/10/ai-allows-paralyzed-person-handwrite-his-mind" target="_blank">AI allows paralyzed person to “handwrite” with his mind.</a> The brain activity helped train a computer model known as a neural network to interpret the commands, tracing the intended trajectory of his imagined pen tip to create letters.</p>
</li>
<li>
<p><a href="https://www.futuretimeline.net/blog/2019/10/9.htm" target="_blank">20 new moons of Saturn.</a> Astronomers report the discovery of 20 new natural satellites of Saturn – taking the planet’s known number to 82, surpassing Jupiter, and pushing the total count for the Solar System above 200.</p>
</li>
</ul>
<p><img src="/images/2019/10/new-moons.jpg" alt="" class="img-responsive" />
<sub>Credit: NASA/JPL-Caltech/Space Science Institute. Starry background courtesy of Paolo Sartorio</sub></p>Valdas MaksimavičiusHere is a monthly list of my latest findings, things I enjoy or ponder. I look into new tools and releases made by Microsoft, its competitors and others. I collect and share with you informative materials about data, AI, cloud. Last but not least, I add futuristic content that I’ve dug up in different places.Outrageous spending in Azure and how to avoid it2019-10-27T00:00:00+00:002019-10-27T00:00:00+00:00https://www.valdas.blog/2019/10/27/azure-subscriptions<p>A few weeks ago I spoke at Swedbank Seedtalks on Big Data & Analytics in Riga, where I shared my experience in building a Data Science environment on Microsoft Azure.</p>
<p>I met with professionals interested in migrating workloads to cloud-based services. Many wanted to reduce the risk that someone spins up resources and leaves them running for months before anyone notices. Here are some tricks to help you avoid unnecessary expenses.</p>
<!--more-->
<p>There are tools such as the pricing calculator for estimating costs, budgets and cost alerts, and cost reviews against your latest invoice. But all of these are soft reminders about your cloud activity rather than hard stop rules.</p>
<p>If you want to shut down or delete resources based on your spending, I suggest creating budget action group triggers:</p>
<ul>
<li>Get notified by email/SMS when you reach 75% of your budget;</li>
<li>Trigger a script (e.g. via Automation, Functions, or Logic Apps) to shut down or delete resources when you reach 100% of your budget.</li>
</ul>
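<p>The two thresholds above can be sketched as a small decision function. This is only an illustration of the logic, not Azure's API: the <code>BudgetRule</code> class, the <code>triggered_actions</code> function and the action names <code>notify</code> and <code>shutdown</code> are hypothetical, and in a real setup the budget itself would call an action group that runs your Automation runbook, Function, or Logic App.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetRule:
    threshold: float  # fraction of the budget, e.g. 0.75 for 75%
    action: str       # hypothetical action name, not an Azure identifier


# Thresholds come from the list above; the action names are made up.
RULES = (
    BudgetRule(threshold=0.75, action="notify"),
    BudgetRule(threshold=1.00, action="shutdown"),
)


def triggered_actions(spent: float, budget: float, rules=RULES) -> list[str]:
    """Return the actions whose threshold the current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used = spent / budget
    return [rule.action for rule in rules if used >= rule.threshold]


if __name__ == "__main__":
    # Example spend levels against a 1000 Eur budget (illustrative numbers).
    for spent in (500.0, 800.0, 1225.0):
        print(spent, triggered_actions(spent, budget=1000.0))
```

<p>Keeping the rules as data makes it easy to add another step, for example a warning at 90%, without touching the decision logic.</p>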
<p><b><a href="https://docs.microsoft.com/en-us/azure/billing/billing-getting-started" target="_blank">Intro to preventing unexpected charges with Azure billing and cost management</a></b></p>
<p><b><a href="https://docs.microsoft.com/en-us/azure/cost-management/tutorial-acm-create-budgets#trigger-an-action-group" target="_blank">Tutorial: Create and manage Azure budgets</a></b></p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/UrkHiUx19Po" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>
<hr />
<p>Microsoft runs a voting service where end users can express their interest in new features. There are a few pricing-related feature requests that you can upvote:</p>
<ul>
<li><a href="https://feedback.azure.com/forums/170030-signup-and-billing/suggestions/16437664-allow-setting-spending-limit-on-any-account-type" target="_blank">Microsoft feedback - Allow setting spending limit on any account type</a></li>
</ul>
<p><img src="/images/2019/10/azure_pricing.jpg" alt="" class="img-responsive" /></p>
<ul>
<li><a href="https://feedback.azure.com/forums/906772-cost-management/suggestions/17690389-azure-spending-limit-pr-resource-group-instead-of" target="_blank">Microsoft feedback - Azure spending limit pr. Resource Group instead of pr. subscription</a></li>
</ul>
<p><img src="/images/2019/10/azure_pricing2.jpg" alt="" class="img-responsive" /></p>Valdas MaksimavičiusAstrophysics for People in a Hurry by Neil deGrasse Tyson2019-05-12T00:00:00+00:002019-05-12T00:00:00+00:00https://www.valdas.blog/2019/05/12/astrophysics<p>A red giant star and a white dwarf star sit at the centre of the nebula, forming a binary “hourglass-shaped” star system. The stars perform, as described by NASA, a “gravitational waltz”.</p>
<p><img src="/images/2019/05/Hubble-Space-Telescope-Southern-Crab-Nebula.jpg" alt="" class="img-responsive" />
<sub>Hubble: The Southern Crab Nebula is named after its resemblance (Image: NASA)</sub></p>
<!--more-->
<p>The cosmic perspective comes from the frontiers of science, yet it is not solely the provenance of the scientist. It belongs to everyone.<br />
The cosmic perspective is humble.<br />
The cosmic perspective is spiritual—even redemptive—but not religious.<br />
The cosmic perspective enables us to grasp, in the same thought, the large and the small.<br />
The cosmic perspective opens our minds to extraordinary ideas but does not leave them so open that our brains spill out, making us susceptible to believing anything we’re told.<br />
The cosmic perspective opens our eyes to the universe, not as a benevolent cradle designed to nurture life but as a cold, lonely, hazardous place, forcing us to reassess the value of all humans to one another.<br />
The cosmic perspective shows Earth to be a mote. But it’s a precious mote and, for the moment, it’s the only home we have.<br />
The cosmic perspective finds beauty in the images of planets, moons, stars, and nebulae, but also celebrates the laws of physics that shape them.<br />
The cosmic perspective enables us to see beyond our circumstances, allowing us to transcend the primal search for food, shelter, and a mate.<br />
The cosmic perspective reminds us that in space, where there is no air, a flag will not wave—an indication that perhaps flag-waving and space exploration do not mix.<br />
The cosmic perspective not only embraces our genetic kinship with all life on Earth but also values our chemical kinship with any yet-to-be discovered life in the universe, as well as our atomic kinship with the universe itself.</p>
<p><strong>Fragment taken from Astrophysics for People in a Hurry by Neil deGrasse Tyson</strong></p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/7g4ZZbVSpdo" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>Valdas Maksimavičius