Wednesday, August 14, 2019

New Era Of The World Order

While configuring my Google account today, I found out I still have this blog. It has not been updated in years.

I decided to pick it up again and write some thoughts down in bits & bytes. Over the years I thought I had a negative view of the world order and a lot of dark thoughts. After reading some tweets on recent events, I finally understood how dark the internet has become. Bots and real people are mixed together on both sides. The rage is real and enormous. Here are some random thoughts and points.
  • people who have nothing to lose have no fear
  • people who have a lot to lose have too much pride and too many resources to admit defeat
  • it is a lose-lose situation, and the poor will only lose more
  • the good news is that this will turn into progress for the future
  • this will not end well for Hong Kong in the short term

Saturday, July 30, 2016

Choosing a CDN

CDN stands for Content Delivery Network. Do you need one? Well, let's see.

1. Are you running a high-traffic website with geographically distributed users? Y/N

2. Is latency important for your users? Y/N

If you answer yes to both, then you should use a CDN. Here is a list of items you should consider when choosing a CDN:

  • Shield your origin
  • Cache purge speed
  • API support for normal use and configuration
  • Override path and cache behavior
  • Support for multiple origins
  • GeoIP and condition-based rewrites
  • WAF
  • Log file streaming to S3 or other services
  • Multi-factor authentication for admin access
  • HTTP/2
  • Customer support (yes, you will need it)
  • Apex domain support
  • Speed of deploying a new config
  • 3rd party integration
  • Ignore query parameters
  • Community
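
To make the "Ignore query parameters" item a bit more concrete, here is a minimal Python sketch of the idea: the cache key is built from the URL after dropping parameters that do not change the response. This is only an illustration of the concept, not how any particular CDN implements it. The function name and the list of ignored parameters are my own assumptions.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of parameters that do not change the response body.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}


def cache_key(url, ignored=IGNORED_PARAMS):
    """Return a normalized URL to use as a cache key (illustrative only)."""
    parts = urlsplit(url)
    # Keep only the parameters that affect the content, in a stable order,
    # so that ?a=1&b=2 and ?b=2&a=1 share one cache entry.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    )
    # Drop the fragment; it is never sent to the server anyway.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )
```

Without something like this, every tracking parameter creates a new cache entry and the request goes back to your origin.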

I'm a long-time CDN user and have worked with some market-leading vendors. A recent migration to Fastly (a Varnish-based solution) showed that smaller players are not necessarily slower.

To all friends out there: stay bias-free. Try them out and see what's best for you!

Saturday, September 26, 2015

Victim of its own success


On Wednesday this week, I believe four different people from different companies and departments told me that they are victims of their own success. Coincidence? Maybe, maybe not. It has had me thinking for a few days now: why does it happen, and how do you avoid it?

I did some research to try to understand what it means. What those companies have in common is growing pains: being successful in some business areas also leads to debt in other areas. Another common denominator is that they are IT and cloud centric, with IT as part of the problem or of the solution.

Many companies suffer from unpredicted IT complexity. Systems and integrations tend to be function focused. Creative developers and system integrators can come up with some real “wow or wtf” solutions. Some people believe documentation is the best solution. I believe in focusing on simplifying the complexity and making systems standard, or KISS. Most of us are not that special snowflake, and all those special variants won’t add up in the end. Imagine services that are interconnected to complete a business activity. Every one-off hack or deviation will increase the failure rate of your next business change.

The IT landscape is like your home. It needs maintenance and housekeeping. When your business is going strong, you might just purchase new furniture/systems or maybe even a new house to satisfy your needs. However, your walls, water pipes and kitchen drain are like infrastructure services and should always work. You might drop your clothes/data here and there, but sooner or later someone has to organize them. Finding the balance between housekeeping and growth is not always trivial. I'd rather do regular small or medium housekeeping than a serious renovation. This includes keeping the right amount of tech debt. A risk assessment can help the organization find out how it is doing.

Last but not least, people struggle over the process and position they were hired for, not the mission and vision they can contribute to or are capable of. Bigger organizations usually have better processes but lack personal responsibility for the matter. Smaller organizations are enthusiastic but usually lack efficient processes. Kaizen is one of many ways to improve that. Vision and mission need to become measurable.


To avoid becoming a victim of your own success: keep it simple, balance growth and housekeeping, improve continuously based on metrics, and bring in people who want to be responsible for the success, not the position. Be a hero of your own success!

Friday, September 18, 2015

Optimizing your cloud

Recently I've been "forced" by upper management to look into some of those cloud optimization services (no names disclosed here).
After investigating, I came to the conclusion that we are doing pretty well on all aspects. How did we do it without purchasing another service?
The key to success is understanding your business activities on top of IT. Let's look at the common features of cloud optimization services for AWS and how you can master them yourself.
  • Cloud Resource Optimization - Enable "Detailed billing report with resources and tags" and "Detailed billing report" in Billing. With these, you can easily sort your costs by tag, on multiple levels (department, environment, business unit, anything you can imagine). Enable CloudWatch or other monitoring to understand your utilization. Read your usage report monthly.
  • Scheduling & Automation - Invest in a tool you or your team can manage: Chef, Puppet, Ansible or Stackdriver, maybe combined with Zapier or a homebrew script, anything you feel comfortable with. Don't overdo or underdo it. Scheduling & Automation is there to avoid human error; as the Chinese idiom goes, "either success or failure boils down to the same person".
  • Disaster Recovery Management - Everything fails! If you are not Netflix and cannot run Chaos Monkey, at least have a disaster playbook for all IT components. What if Route 53 is not working? What do you do when the web servers are down? How do you fix a capacity problem on RDS? Remember to include likelihood and impact; this can help you improve systems and services later.
  • Multi AWS regions and accounts - If your business doesn't have geographical restrictions, feel free to use different AWS regions to gain better resilience. AWS offers excellent support for multiple regions. When it comes to managing multiple accounts, I'm traditional and conservative for security reasons: don't do it from the same or a shared computer. Use a different VM or an isolated environment. Many companies keep their disaster recovery/backup in another account.
  • Server Grouping & Bulk Action - If you have done Cloud Resource Optimization and Scheduling & Automation correctly, this is solved already. If not, the AWS CLI + Ansible will get you very far.
  • Easy Server Interface - Go with the AWS CLI + Ansible.
  • Server Build Wizard - Many services do this for free or for a small price. Personally, I think servers should be abstracted more toward as-a-service. Understand your distro/OS, and only tune when you must.
  • Permission Management - Security is never stronger than its weakest link. Build the security process before you consider a tool for Permission Management. The principle of least privilege and a password tool will keep you relatively safe for a long while.
  • Image & Snapshot Management - This is similar to Server Build Wizard, but in most cases a simple script and the AWS CLI will get you very far. If not, simplify the complexity. Do you really need all the images and snapshots? Are they must-have or nice-to-have?
  • Manage Clients & Billing - If you have done tags and permission management correctly, this should not be much work. I'd introduce a log service or a simple DB to store historical data points, e.g. ELK or Graphite.
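
As a rough illustration of the "sort your costs by tag" point, here is a small Python sketch that sums a billing report per tag value. It assumes the report is a CSV with one column per tag and one cost column. The column names (`user:department`, `Cost`) and the sample rows are made up for the example, so check them against your own report before using anything like this.

```python
import csv
import io
from collections import defaultdict


def cost_by_tag(report_file, tag_column, cost_column="Cost"):
    """Sum the cost column per value of one tag column."""
    totals = defaultdict(float)
    for row in csv.DictReader(report_file):
        # Untagged resources are worth tracking too: they show tagging gaps.
        tag = row.get(tag_column) or "(untagged)"
        totals[tag] += float(row.get(cost_column) or 0)
    return dict(totals)


# Made-up sample data, only to show the shape of the result.
SAMPLE = """user:department,user:environment,Cost
web,prod,10.50
web,dev,2.25
data,prod,7.00
,prod,1.00
"""

totals = cost_by_tag(io.StringIO(SAMPLE), "user:department")
for tag, cost in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{tag:12s} {cost:8.2f}")
```

Run the same function with `user:environment` (or any other tag column) to slice the same report on another level.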
To summarize, I'm not saying those services/tools are useless. Every organization must understand its business activities on top of IT first, before it can optimize IT. Don't fall for the cloud-hype marketing tricks.
I'm finishing this blog post from the air. Safe flight above your cloud & keep the lights on!

Friday, September 11, 2015

Cloud, and what's next?

When I arrived in Switzerland and went to some social activities to meet locals, most people, including those who work in IT, needed an explanation of what the cloud is and why it matters. There are still a lot of misconceptions, but I think I managed to explain the why after a while. This morning, a friend from the far north asked me: is the cloud the next thing? I told him the cloud is actually now. The future is something else.

Look around at all the services: Something-as-a-Service is literally everywhere. It is hard to say "I am not using any cloud services". Enterprises are also starting to catch up. Adoption is higher and faster than ever, which is why it is extremely hard to find skilled people.

The road to the cloud can be complex; here are some strategic hints.
Seven perspectives to look over:
  • Business Perspective
  • Platform Perspective
  • Maturity Perspective
  • People Perspective
  • Process Perspective
  • Operations Perspective
  • Security Perspective

Saturday, June 7, 2014

Positive energy

From computers to something completely different, but still relevant.

Walking on the street or anywhere else you can meet other people, I noticed that when people weren't interacting with others, most of them didn't have a smile or a happy face. Maybe they were deep in serious thought, or maybe they were just tired, stressed or worried about life, business or whatever was bothering them. Still, it felt like no positive energy came out of them. In contrast, kids usually have a curious and glad expression, even when they are not interacting with others. This inspires me to look at the world differently and discover new things around us. Being positive is what matters most for our personal well-being and success.

When we get older, we become bitter like an India pale ale. Maybe because we understand how terrible the world is. Maybe because we forget how important the dream is and get lost in the pursuit of happiness.

In the past weeks, I have been focused on various tasks. Previously I was anxious about not producing enough. After some serious thought, I understood that we have limited time and can only focus on a few things. One simple way to maximise throughput is to switch between different tasks to keep the brain excited. Second, positive energy will help us overcome all difficulties.

I watched The Lion King with my daughter the other day. Even though she didn't fully understand the whole story yet, she seems to understand "hakuna matata".

Saturday, January 25, 2014

RHEL 7 beta test

Some highlights from the RHEL 7 beta. Planning to put one into production asap.


  • LIO kernel target subsystem - better storage support
  • LVM2 replaces LVM, with support for thin provisioning
  • No more 32 bits
  • XFS and Btrfs!
  • Swap memory compression
  • Fast block devices caching slower block devices
  • Automatic relocation of processes and memory for better NUMA performance
  • APIC virtualization
  • Long list of KVM improvements, including hot plug vCPU
  • rgmanager, luci, ricci, ccs replaced with pacemaker (about time...)
  • ruby 2.0.0, python 2.7.5! It's almost latest and greatest now, but for how long?
  • Performance Co-Pilot, performance tools. I can't wait to dive in
  • chrony replaces ntp
  • firewalld is a dynamic firewall daemon
  • OpenLMI (another puppet & chef alternative?)
  • OpenSSH chroot shell logins
  • Multiple required authentications
  • Need to learn how systemd works
  • Gnome 3, with screencast support
  • DM-Multipath improved performance
  • Linux containers as another interesting lightweight alternative to KVM
  • tickless kernel (great for battery life, also less CPU interrupt)


Thursday, January 16, 2014

Tuning tips

Over the years, improving existing systems has been one of my biggest passions, and not only because I mainly work with infrastructure and operations. Understanding a system's complexity and improving it is as challenging as black magic. I’d like to share some of my experiences in this area.

Step 1 – Overview. Before you start using Google to search for “tuning tcp/ip on tomcat” or whatever you need to improve, create a holistic view, either top down or bottom up, from the user to the database, including all the components used in the workflow. You might find later that something least expected is causing the latency. For example, a busy web cluster does millions of DNS lookups for the same backend server because it is not using a DNS cache. But the delay is not caused by the DNS lookups themselves; it’s caused by the firewall between the layers randomly dropping UDP packets.

Step 2 – Measure. Tuning a complex system can drain enormous resources, and you might feel hopeless or clueless from time to time. The key to successful tuning is measuring, not only end to end but also between the services and applications. If you cannot measure it, you cannot improve it; more importantly, you won’t know how much you’ve improved. Review the overview and measure the interesting paths. The more data you collect, the easier it is to find the bottleneck.
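
To show what measuring "between the services" can look like, here is a minimal Python sketch that times each hop of a request separately as well as the whole path. The step names and the sleeps are stand-ins I made up; in a real system each block would wrap an actual DNS lookup, HTTP call or database query.

```python
import time
from contextlib import contextmanager

timings = {}  # step name -> list of durations in seconds


@contextmanager
def measure(step):
    """Record how long the wrapped block takes under the given step name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(step, []).append(time.perf_counter() - start)


def slowest(samples):
    """Return the step with the highest total time: the first bottleneck candidate."""
    return max(samples, key=lambda step: sum(samples[step]))


# Stand-ins for real hops; the sleep durations are invented for the example.
with measure("end_to_end"):
    with measure("dns_lookup"):
        time.sleep(0.01)
    with measure("database"):
        time.sleep(0.05)

hops = {step: values for step, values in timings.items() if step != "end_to_end"}
print("slowest hop:", slowest(hops))
```

The same numbers, shipped to a time-series tool, also give you the "how much have we improved" answer after each change.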

Step 3 – Bias Free. When performance becomes an issue, people tend to blame unknowns to protect themselves. We also love to attack the symptom rather than look a couple of steps further for the root cause. That might be a perfectly reasonable action, but the real gain usually comes from resolving the root cause. In reality, we need both short-term mitigation and long-term resolution. Bias will blind your judgment and instinct. In many practical cases, the symptom is only a reflection of the real problem.

Last but not least, you need a great toolbox to complete your mission. Depending on the situation, you might need different tools. Here are some of my favorites.
  • logstash – a great central logging system
  • tcpdump, dsniff, wireshark – packet sniffing
  • graphite – graph tool, perfect for time-series data
  • ab, siege – http load test tools
  • new relic, tracelytics – full stack performance insight tools
  • sysstat, vmstat, iostat, htop – OS level monitoring tools
  • strace, valgrind, systemtap – deep OS level troubleshooting tools
  • iperf – TCP/UDP bandwidth measurement tool
  • charles – client side web debugging proxy


Tuesday, December 10, 2013

Escape Cloud prolog

After years of working in cloud computing, I decided to write something about escaping the cloud. The reason is to help people understand, most importantly, "the choices" they have when moving into or out of the cloud. What exit routes do you have? Why and how?

Thursday, October 3, 2013

IT Operations and innovations

When we talk about IT Operations, terms like IT budget, service desk, security, keeping the lights on, fire fighting and ITIL usually come to mind first. IT Operations is usually associated with a conservative mindset, old-school thinking and day-to-day tasks. In general, the "boring stuff".

No matter how new or skinny the system/software/application is, once you put it into production and move on, it will eventually get old and buggy here or there. Even worse, it may just lie there without anyone using it at all. IT Operations cannot just ignore it. Over the years, this becomes more and more severe, and it's a heavy burden for business and system engineers. Some studies have shown that we spend about 25-40% of the IT budget on just keeping the lights on & fire fighting.

That's why innovation is important for IT Operations: we need to align IT and business. We need to be smarter; we don't want death by overwork at night. The list is long. "Untested and unreviewed infrastructure code is akin to running the nation's railways on untested and incompatible track, points and signals." Sadly, that is how most businesses' IT systems are today, but not necessarily tomorrow.


Friday, August 30, 2013

Overengineering

In the real world, a designer or architect might draw an amazing sculpture of a building, and then the engineers tell them, "Sorry man! This, that and those are impossible according to the laws of physics."

In the world of IT, the business may want a simple service to record time and expenses. Engineers or software vendors will add thousands of functions you don't need. Some of them are necessary, but most of them are not even close to nice-to-have.


Why?
First, it's not only that we have different playgrounds. Engineers are too narrowly focused and forget the balance. For example: making everything configurable to "avoid hard-coding", with the effect that there's more (complex, failure-prone) application logic in configuration files than in source code.

Second, the tools and interfaces of applications are constantly changing. Engineers sometimes fall for the latest and greatest trick, lose focus on KISS and chase the cool new shiny thing rather than the actual business. Or the opposite: they suffer on legacy.

Third and last, we treat the "IT stuff" like dark voodoo magic. We blindfold ourselves and believe in how great we are doing.



Great reading on Overengineering and YAGNI

Tuesday, August 13, 2013

Passed Amazon’s New AWS Certified Solutions Architect Exam

As #1146. I wasn't feeling very well before the exam. I think I had a low fever. Anyway, this is just a drop in a huge cloud.

Saturday, August 10, 2013

Work smart, not hard. Clear your communication.

It took me many years to understand the difference between effort and results, between idealism and pragmatism. Many people I know think I work hard. Yes, maybe I do, but mostly I am trying to work smart instead.

For example, I created a ticket with some fancy/blurry terms in it, such as automation and provisioning. The technician got confused and asked for some time to look at it. I asked what needed to be looked at, and then I understood that the problem was that those fancy words meant nothing to him.

It took me only a few minutes to explain what I meant and why I used those fancy terms: this "create-system-account-for-provision-new-server-with-chef-without-sysadmin-interactive" doesn't really have a standard name yet.

So, lesson learned. Clear up your communication and get rid of fancy words. Let people know what you want to do and how to do it. Share your wisdom. Be gentle, but confident. Stand by your decision and be proud of it. At the end of the day, you have to eat your own dog food.

Saturday, July 27, 2013

Software doesn't solve the problem

Similar to "Guns don't kill people; people kill people." Many IT people believe a new technology or tool can solve the problem. Well, it can help, but it is people with knowledge, experience, heart and passion who solve the problem. New software is just an enhancement.

Friday, July 19, 2013

Fail cheap, fail fast, fail often

Between good, cheap and fast service, you can only choose two.
* Good & Cheap service won't be Fast
* Good & Fast service won't be Cheap
* Fast & Cheap service won't be Good

When it comes to failing, you can do cheap, fast and often at the same time. Paradoxically, if you try to avoid them, it will cost you much more. I've worked with many companies and only a very few have understood this. People believe that they can succeed by avoiding failure. It's sad, but true.

No pain, no gain!

Every step you fail brings you closer to success!