Some Notes
Saturday, February 17, 2024
Sunday, February 4, 2024
Tuesday, January 23, 2024
Clocks
There are two main problems to resolve: (a) events need to be totally ordered, and (b) concurrent events need to be detected. Lamport timestamps address the total-order problem (with ties broken by, e.g., a node ID), and vector clocks help detect concurrent events.
Lamport clocks: https://www.youtube.com/watch?app=desktop&v=cc3vjDHWYx8
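A minimal Python sketch of the two ideas, written for these notes rather than taken from the linked video. The names LamportClock, vc_leq and concurrent are illustrative assumptions. Lamport timestamps are compared as (time, node_id) pairs so that ties are broken deterministically; vector clocks are modeled as plain dicts from node ID to counter.

```python
from dataclasses import dataclass


@dataclass
class LamportClock:
    """Scalar logical clock; (time, node_id) pairs give a total order."""
    node_id: str
    time: int = 0

    def tick(self):
        # Local event or message send: advance the counter.
        self.time += 1
        return (self.time, self.node_id)

    def receive(self, sender_time):
        # Message receive: jump past the sender's timestamp.
        self.time = max(self.time, sender_time) + 1
        return (self.time, self.node_id)


def vc_leq(a, b):
    """True if vector clock a is component-wise <= vector clock b."""
    return all(a.get(k, 0) <= b.get(k, 0) for k in a.keys() | b.keys())


def concurrent(a, b):
    """Two events are concurrent if neither vector clock dominates the other."""
    return not vc_leq(a, b) and not vc_leq(b, a)
```

Sorting the (time, node_id) pairs gives one total order that respects causality, but it cannot tell whether two events were concurrent; that is the part the vector clock comparison adds.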
Thursday, January 26, 2023
Masterminds of programming: Folks talking about why they designed certain programming languages
https://www.amazon.com/dp/0596515170
http://bobzhang.dscloud.me/Calibre%20Library/Biancuzzi%2C%20Federico_%20O%27Reilly/Masterminds%20of%20Programming%20%28197%29/Masterminds%20of%20Programming%20-%20Biancuzzi%2C%20Federico_%20O%27Reilly.pdf
Sunday, October 16, 2022
Primes and particles
Recently I realized that primes are similar to the "elementary particles" of arithmetic, i.e. other numbers can be constructed from primes (by multiplication), but primes cannot be constructed from other numbers.
In that regard, I am making a note of a couple of things I came across about how primes are related to the physical world:
- https://www.quantamagazine.org/a-chemist-shines-light-on-a-surprising-prime-number-pattern-20180514/
- On Particles and Primes: https://arxiv.org/abs/1608.07175. This paper is quite heavy, but it is a good opportunity to introduce myself to the terminology outlined therein, such as:
- associative normed division algebras C (the complex numbers) and H (the quaternions)
Thursday, October 13, 2022
Friday, September 23, 2022
Linux Network Virtualization
https://www.ibm.com/cloud/blog/diagnosing-packet-loss-in-linux-network-virtualization-layers-part-2
Friday, May 27, 2022
Monday, April 18, 2022
Sunday, March 13, 2022
Federation architectures for identity: Full mesh, hub & spoke, hub-spoke with centralized login
Federation architectures for identity: https://wiki.geant.org/display/eduGAIN/Federation+Architectures
CDN caches, Traffic classes and Footprint descriptors
CDN caches:
https://groups.cs.umass.edu/ramesh/wp-content/uploads/sites/3/2019/12/Footprint-Descriptors-Theory-and-Practice-of-Cache-Provisioning-in-a-Global-CDN.pdf
Tuesday, October 5, 2021
Intro to virtualization
High-level intro to virtualization types: https://virtualizationreview.com/articles/2014/10/14/7-layer-virtualization-model.aspx
Here is an intro to Docker layers: https://windsock.io/explaining-docker-image-ids/ which in turn leads to a common (but previously unknown to me) mechanism called COW, or copy-on-write: https://en.wikipedia.org/wiki/Copy-on-write
Sunday, July 11, 2021
Explorations into Type Theory
Came across this: https://ferenc.andrasek.hu/papersybprx/Stephen_Yablo_circularity_and_paradox.pdf
which gave me the hint that if we are seeing a paradox, either a semantic one or a set-theoretic one, the problem lies in our language. Apparently, if we describe it in a better language, the paradox disappears, i.e. the confusion/fudge factor clears up.
Also, this led to some exploration into how people view Category Theory; https://math.stackexchange.com/questions/2522116/is-category-theory-more-abstract-than-set-theory-or-proof-theory was a good example.
This in turn took me to a page I've looked at previously, with a diagram that seems to be an attempt by people to seek a theory of truth.
Friday, July 2, 2021
Coverage vs. Functional/Requirement Testing
Here are some examples where functional testing differs from coverage testing. I am assuming that the term "coverage" means code coverage, including both branch and loop coverage. In summary, testing is of the following types: functional (feature/requirement) testing and code coverage testing.
- Overflows/underflows of variables: for the following function, coverage is satisfied by passing any values of a and b; it does not care whether the return value is correct when a * b overflows a 32-bit int.
int foo(int a, int b) {
    return a * b;
}
- Handling exceptions from lower layers: say we have the following code, where the innerfoo function can throw exceptions. 100% coverage achieved by mocking the innerfoo function will still miss the cases that are not handled, i.e. coverage will test whatever is written, but won't test whether what is written is enough from the functionality point of view.
int foo() {
    try {
        innerfoo();
    } catch (ExceptionName1 e1) {
        // catch block
    } catch (ExceptionName2 e2) {
        // catch block
    } catch (ExceptionNameN eN) {
        // catch block
    }
    return 0; // placeholder return value
}
- Coverage testing does not cover range testing. For example:
int foo(int a) {
    int b = innerfoo(a);
    if (b < 100) {
        // do something
    }
    return 0; // placeholder return value
}
In this case, statement coverage will be 100% if we pass a value of a that returns a b such that execution enters the if block. Branch coverage additionally needs one call with b >= 100, but neither kind of coverage will note that the code for the else is missing, so when b >= 100 it is not clear that the function foo will do the right thing.
In summary, coverage does not check functional requirements.
- Testing features vs. testing the code that is written (functional testing / feature coverage / requirement coverage): code coverage will cover all branches, but it cannot find out how much of the written code actually matches what is desired from the feature. If the feature is supposed to do X, Y and Z, whether the code does that is found out by feature testing; code coverage only checks whether the tests exercise the written code. This is also called functional testing.
- When people use the term "unit testing", it effectively means doing feature/requirement testing for a single function.
- When people refer to integration testing, they mean unit-testing the combination of units. For example, imagine there are 2 gears that are individually unit-tested; integration testing tests the behavior of the 2 gears locked together, e.g. verifying that turning the left gear anticlockwise turns the right gear clockwise. In terms of code, coverage will verify the unit testing of the individual components but, as mentioned previously, may not cover overflows/exceptions, i.e. it will verify that whatever is written is fine in terms of code logic and execution, but won't verify that whatever is written is what we want!
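The overflow and range examples above can be sketched in Python as follows. This is only an illustration under stated assumptions: ctypes.c_int32 is used to emulate the 32-bit wrap-around of a C int, and classify with its "small"/"unset" results is a hypothetical stand-in for the function with the missing else.

```python
import ctypes


def foo(a, b):
    """Multiply the way a 32-bit C int would: the product wraps around."""
    return ctypes.c_int32(a * b).value


def classify(b):
    """Hypothetical function whose else branch was never written."""
    result = "unset"
    if b < 100:
        result = "small"
    return result


# These two calls execute every line of both functions (100% statement coverage).
assert foo(2, 3) == 6
assert classify(5) == "small"

# Functional/range tests are what expose the problems coverage cannot see:
overflowed = foo(65536, 65536)   # the true product 2**32 does not fit in 32 bits
unhandled = classify(150)        # no requirement says what should happen here
```

Here overflowed is 0 and unhandled is "unset", yet a coverage report for the first two assertions alone would already show every line as executed.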
Wednesday, April 21, 2021
Philosophy of Networking
https://networkologies.files.wordpress.com/2014/10/5-networkedmind.pdf is a way to look at the world from a "networking" point of view :). Not sure what it is, but posting this as a reminder for me to follow up,
and as a reminder to learn about "Object Oriented Philosophy".
Sunday, January 24, 2021
On the Complexity of Crafting Crash-Consistent Applications
https://research.cs.wisc.edu/adsl/Publications/alice-osdi14.pdf which is a comprehensive study of application level crash-consistency protocols built atop modern file systems.
Wednesday, October 21, 2020
Transactions vs. locks
https://makandracards.com/makandra/31937-differences-between-transactions-and-locking
One good insight this gave me (in retrospect I think I knew this, but I am making a note nevertheless): the purpose of a lock is to ensure that only 1 thread at a time executes a critical piece of code. Thus a lock is, in its essence, a queue of thread IDs.
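To make the "queue of thread IDs" picture concrete, here is a toy Python model. It is an assumption-laden sketch, not a usable synchronization primitive: QueueLock is a made-up name, nothing actually blocks, and real mutexes do not necessarily hand the lock over in FIFO order.

```python
from collections import deque


class QueueLock:
    """Toy model of a lock: one owner plus a FIFO queue of waiting thread IDs."""

    def __init__(self):
        self.owner = None
        self.waiters = deque()

    def acquire(self, thread_id):
        # Free lock: the caller becomes the owner immediately.
        if self.owner is None:
            self.owner = thread_id
            return True
        # Held lock: the caller is queued behind earlier waiters.
        self.waiters.append(thread_id)
        return False

    def release(self, thread_id):
        if self.owner != thread_id:
            raise RuntimeError("only the owner may release the lock")
        # Hand the lock to the longest-waiting thread, if any.
        self.owner = self.waiters.popleft() if self.waiters else None
        return self.owner
```

A transaction, by contrast, is about grouping several changes so they succeed or fail together, which this model says nothing about.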
Saturday, June 20, 2020
Thursday, June 4, 2020
Some myths of Maths
which took me to this video
https://youtu.be/fwtVlcO6s-Y
Friday, May 29, 2020
Monday, May 18, 2020
Tuesday, May 5, 2020
Saturday, May 2, 2020
Saturday, April 18, 2020
Saturday, April 4, 2020
Sunday, March 15, 2020
Network Isolation of Namespaces
https://lwn.net/Articles/219794/ (2007)
https://lwn.net/Articles/531114/ (2013)
- Modern descriptions:
https://jvns.ca/blog/2016/10/10/what-even-is-a-container/, and then a discussion of network namespaces on Linux: https://blog.scottlowe.org/2013/09/04/introducing-linux-network-namespaces/
"Each network namespace is logically a separate networking stack, with separate addresses, separate firewall rules, separate QoS policies, etc."
This has nothing to do with systemd etc.
In this picture we created 3 veth pairs:
- ns1_veth0, globalns_veth1 (1st pair)
- ns2_veth0, globalns_veth2 (2nd pair)
- ns1_veth1, ns2_veth1 (3rd pair)
https://etherarp.net/network-isolation-of-services-with-systemd/
Sunday, March 8, 2020
Saturday, March 7, 2020
Infra Layers
Got this picture from https://www.theregister.co.uk/2017/12/06/what_is_terraform/, and it was enjoyable for me to see how I moved from the layer above (@Uber) to a layer below (@OCI)
Thursday, March 5, 2020
Tuesday, March 3, 2020
Sunday, February 23, 2020
Processes on Linux
0 The Scheduler
1 The init process
2 kflushd
3 kupdate
4 kpiod
5 kswapd
6 mdrecoveryd
(https://unix.stackexchange.com/questions/83322/which-process-has-pid-0)
Saturday, February 22, 2020
Linux Networking
I got this picture from https://epickrram.blogspot.com/2016/05/navigating-linux-kernel-network-stack.html?m=0, and the idea is that this is how the interaction between the network card and the driver running on the CPU (in the interrupt handler?) works. The kernel module copies the data into a buffer called skbuff, i.e. the sk_buff data structure, which is pictured at https://opensourceforu.com/2016/10/network-performance-monitoring/
Go to this link for some deep dive on Raw sockets: https://packetstormsecurity.com/files/72743/SOCK_RAW-Demystified.html
Monday, February 17, 2020
Wednesday, February 12, 2020
Sunday, February 9, 2020
A compilation of software anti-patterns
- Linguistic anti-patterns (functions that are named wrongly/improperly): http://www.ptidej.net/publications/documents/CSMR13d.doc.pdf
- Architecture anti-patterns: http://www.se.rit.edu/~swen-440/slides/instructor-specific/Kuehl/Lecture%2018.1%20Architecture%20Antipatterns.pdf
Wednesday, February 5, 2020
Concurrency Control
- You can avoid conflicts by employing a pessimistic locking mechanism (e.g. read/write locks, two-phase locking).
- You can allow conflicts to occur, but you then need to detect them using an optimistic locking mechanism (e.g. logical clock, MVCC). This is essentially the same concept of "tokens" that I was previously exposed to in the context of distributed locks.
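The optimistic variant can be sketched in Python with a per-key version counter acting as the logical clock. VersionedStore and VersionConflict are hypothetical names used only for illustration; a real system would also need the check-and-write step to be atomic.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version."""


class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Unknown keys start at version 0 with no value.
        return self._data.get(key, (0, None))

    def write(self, key, expected_version, value):
        current_version, _ = self.read(key)
        # Detect the conflict instead of preventing it with a lock.
        if current_version != expected_version:
            raise VersionConflict(
                f"{key}: expected v{expected_version}, found v{current_version}"
            )
        self._data[key] = (current_version + 1, value)
        return current_version + 1
```

Two writers that both read version 0 can both attempt a write; the first succeeds and bumps the version, and the second gets a VersionConflict and has to re-read and retry.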
Tuesday, February 4, 2020
Java
- Understanding JVM architecture https://medium.com/platform-engineer/understanding-jvm-architecture-22c0ddf09722
- Understanding JNI : https://docs.oracle.com/javase/9/docs/specs/jni/index.html (this is needed to understand how Java implements threads )
- Java Threads vs. Pthreads https://medium.com/@unmeshvjoshi/how-java-thread-maps-to-os-thread-e280a9fb2e06
- Guice: I liked this quick video which shows the basic ideas of Guice - modules and "injecting" dependencies into the global map of sorts using BIND, so that later we can use the appropriate class as needed. https://www.youtube.com/watch?v=fe1n8VIXZ-k
- https://technologyconversations.com/2014/06/18/build-tools/ Ant vs. Maven vs. Gradle
Thursday, December 5, 2019
Types of Models
- Iconic
- Analog
- Symbolic
- mathematical
- logical
- adhoc
Monday, December 2, 2019
Interesting Personalities in Distributed Systems and Networking
My interest in the fields of signal processing, computational theory and algorithms was guided by my teacher during undergrad (Udayan Kanade), and subsequently I got interested in computer architecture, video encoding, video streaming, algorithms, optimization, game theory, etc.
Over the years, I have been trying to find similar personalities who would get me interested in the fields of networking and distributed systems. Here are some colorful personalities in the field of distributed systems:
(1) https://martin.kleppmann.com/
Sunday, December 1, 2019
Types of cache
(1) Lookaside Cache (note how Memcache uses the concept of Lease)
(2) Inline/Write through cache. (TAO is an example of a read-through, write-through cache)
https://blog.the-pans.com/different-ways-of-caching-in-distributed-system/
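A rough Python sketch of the difference between the two cache types, using a plain dict as a stand-in for the database. The class names are illustrative assumptions, and the lease mechanism mentioned for Memcache is not modeled here.

```python
class LookAsideCache:
    """The application talks to the cache and the database separately."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        value = self.db[key]      # miss: the application reads the database itself
        self.cache[key] = value   # and then fills the cache
        return value

    def put(self, key, value):
        self.db[key] = value
        self.cache.pop(key, None)  # invalidate; the next get refills the cache


class WriteThroughCache:
    """The cache sits inline: reads and writes both go through it."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def get(self, key):
        if key not in self.cache:
            self.cache[key] = self.db[key]  # read-through on a miss
        return self.cache[key]

    def put(self, key, value):
        self.db[key] = value      # write goes to the backing store
        self.cache[key] = value   # and the cache is updated in the same step
```

In the look-aside version a write that bypasses put leaves a stale entry behind, which is the kind of race that leases are meant to narrow.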
Wednesday, November 13, 2019
Measures of developer productivity
- lines of code (LOC) per unit time (Delorey, Knutson, & Chun, 2007; Maxwell, Van Wassenhove, & Dutta, Oct 1996)
- Function points per unit time (Delorey et al., 2007; Maxwell & Forselius, 2000).
- Number of diffs landed per unit time (Facebook, Uber)
Wednesday, November 6, 2019
Tuesday, November 5, 2019
Finding out what the world is working on
- https://www.tiobe.com/tiobe-index/ the languages that are popular today
- https://octoverse.github.com/projects the projects that are popular today.
Monday, October 14, 2019
Golang tools
Wednesday, October 9, 2019
Tuesday, October 8, 2019
Crypto Notes
GF(2^n) and its use in AES: https://engineering.purdue.edu/kak/compsec/NewLectures/Lecture7.pdf (also uploading the PDF to my Google Drive, in case it goes missing in the future from the Purdue link)
Kubernetes Notes
- Getting Started: https://kubernetes.io/
- Cloud Native Computing Foundation: https://www.cncf.io/
- Course to learn Kubernetes: https://courses.edx.org/courses/course-v1:LinuxFoundationX+LFS158x+2T2019/course/. As a first step install the software called Minikube ( see Instructions here: https://kubernetes.io/docs/setup/learning-environment/minikube/), as a part of which you will need to install kubectl
$ make WHAT='cmd/kubectl'
+++ [1014 17:55:59] Building go targets for darwin/amd64:
    cmd/kubectl
which shows up in the output directory as:
./_output/local/go/bin/kubectl
./_output/local/bin/darwin/amd64/kubectl
To look at the source code, go to the directory cmd; you will see the following files that have an entrypoint main defined, which points out all the different things that the Kubernetes product is built out of:
- clicheck/check_cli_conventions.go:func main() {
- cloud-controller-manager/controller-manager.go:func main() {
- gendocs/gen_kubectl_docs.go:func main() {
- genkubedocs/gen_kube_docs.go:func main() {
- genman/gen_kube_man.go:func main() {
- genswaggertypedocs/swagger_type_docs.go:func main() {
- genyaml/gen_kubectl_yaml.go:func main() {
- hyperkube/main.go:func main() {
- importverifier/importverifier.go:func main() {
- kube-apiserver/apiserver.go:func main() {
- kube-controller-manager/controller-manager.go:func main() {
- kube-proxy/proxy.go:func main() {
- kube-scheduler/scheduler.go:func main() {
- kubeadm/kubeadm.go:func main() {
- kubectl/kubectl.go:func main() {
- kubelet/kubelet.go:func main() {
- kubemark/hollow-node.go:func main() {
- linkcheck/links.go:func main() {
- preferredimports/preferredimports.go:func main() {
- verifydependencies/verifydependencies.go:func main() {
Wednesday, October 2, 2019
iptables / netfilter on linux
the flow http://xkr47.outerspace.dyndns.org/netfilter/packet_flow/
ref: https://www.youtube.com/watch?v=iP8YWcvKDr0
Sunday, September 22, 2019
Tuesday, September 17, 2019
Thread local storage
How to become a good programmer
Write code 3 times!
(https://www.javaworld.com/article/2072651/becoming-a-great-programmer--use-your-trash-can.html)
Embedded C++
RTTI is used by dynamic_cast to check at run time whether a base-class pointer actually points to an object of the derived class, so that it can be safely converted to a derived-class pointer. Look at https://blog.feabhas.com/2013/09/casting-what-could-possibly-go-wrong/
Also look at https://cs.nyu.edu/courses/fall16/CSCI-UA.0470-001/slides/MemoryLayoutMultipleInheritance.pdf for a description of virtual inheritance and the memory layouts therein.
Saturday, September 14, 2019
ipfs
https://ipfs.io/
Further, when reading about IPFS I came across Merkle trees, which are incidentally also used in Git to reduce the time needed to find out what has changed between 2 branches.
This further took me to this paper: https://people.csail.mit.edu/silvio/Selected%20Scientific%20Papers/Zero%20Knowledge/Zero-Knowledge_Sets.pdf
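A small Python sketch of why Merkle trees make change detection cheap: if two roots are equal, nothing underneath needs to be compared. This is a simplified binary tree over a list of byte strings using SHA-256; it is an assumption for illustration and not how IPFS or Git actually lay out their objects.

```python
import hashlib


def _h(data):
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Hash leaves pairwise, level by level, until one root hash remains."""
    if not leaves:
        return _h(b"")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Comparing merkle_root of two versions is a single hash comparison; only when the roots differ does one need to descend into the subtrees whose hashes differ.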
Monday, September 9, 2019
How Facebook scaled up memcached
https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf
Memcached is available on github at https://github.com/memcached/memcached
Ref: https://medium.com/@SkyscannerEng/journey-to-the-centre-of-memcached-b239076e678a and picture from there:
Another reference is https://medium.com/@Alibaba_Cloud/redis-vs-memcached-in-memory-data-storage-systems-3395279b0941
Sunday, September 8, 2019
Containers from scratch
(1) chroot, which does not have private namespaces
(2) creating "namespaces" with unshare
(3) entering a namespace with nsenter
(4) network namespaces can be shared, e.g. across containers
(5) cgroup directories can be created in /sys/fs/cgroup, and then appropriate values configured. cgroups are a way for the kernel to provide "controlled isolation".
Saturday, September 7, 2019
Why BGP is a better IGP
Note that normally, when BGP RIBs are exchanged by two routers that sit in the same AS (iBGP), their RIBs will show the path, but it will be marked with an "i" (internal), and that path will not be re-advertised to other iBGP peers. If we instead give each rack its own AS and run BGP between the racks, it can still be made to work, because each rack's ASN can be a private ASN.
Wednesday, September 4, 2019
Job scheduling in Mesos and Kubernetes
and
https://stackoverflow.com/questions/43076831/dcos-cluster-resource-allocation-is-np-hard/43448790#43448790 (which suggests that Mesosphere Marathon uses a first-fit bin packing algorithm)
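For reference, first-fit bin packing itself is simple to sketch in Python. This is a generic illustration with a single resource dimension and identical nodes; the function name and inputs are assumptions, and it is not Marathon's or Kubernetes' actual scheduler code.

```python
def first_fit(tasks, node_capacity, node_count):
    """Place each (name, demand) task on the first node with enough room.

    Returns a dict of task name -> node index, or None if nothing fits.
    """
    free = [node_capacity] * node_count
    placement = {}
    for name, demand in tasks:
        for index, room in enumerate(free):
            if demand <= room:
                free[index] -= demand
                placement[name] = index
                break
        else:
            placement[name] = None  # no node had enough free capacity
    return placement
```

First-fit is a greedy heuristic: it runs quickly but can leave a task unplaced even when a different ordering of the same tasks would have fit.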
Tuesday, September 3, 2019
Compute Platform comparison
Infrastructure as a Service (IaaS), Containers as a Service (CaaS), and Platform as a Service (PaaS).
Sunday, August 25, 2019
Messaging patterns in ZeroMQ
- Publish/Subscribe
- Synchronous Request/Reply
- Asynchronous Request/Reply
- Push/Pull
- Parallelised pipeline
Sunday, August 18, 2019
Twitter Infra
Periscope infra:
(from https://qr.ae/TWrUa5 ) and a video https://www.youtube.com/watch?v=xjC3ZKYG74g
- Wowza Media Systems for streaming
- PubNub for the chatroom
- Circle CI and Travis CI
- Fabric
- Iron.io
- Algolia for search and indexing
- Slack
Tuesday, August 13, 2019
Scaling globally
- Network scalability & service discovery.
- Compute scalability & virtualization
- Storage scalability
(A) Load Balancers
In summary I have seen load balancers of the following types:
- Proxy based load balancers
- L3 load balancing: DNS based load balancing via pools (round-robin) or via mapping changes (Akamai), or via Anycast (See this for how BGP makes this happen: https://www.imperva.com/blog/how-anycast-works/)
- L4 load balancing via HAProxy (SSL termination via NGINX)
- L7 load balancing via HAProxy and a sidecar like Muttley (Uber), which is essentially based on healthchecks, traffic controller rules, and Zookeeper nodes that are maintained at a /zone/service/ level and updated when a particular service is deployed to a machine.
- Client side load balancers:
- gRPC-based load balancing is an example of client-side load balancing. Refer to https://github.com/grpc/grpc/blob/master/doc/load-balancing.md. (I believe this could be done using something like a Muttley sidecar too)
- DNS based service discovery such as Mesos-DNS
- DNS based service discovery using SRV records (See this https://docs.citrix.com/en-us/citrix-adc/13/dns/service-discovery-using-dns-srv-records.html)
- Zookeeper based service discovery
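As a rough illustration of client-side load balancing, here is a Python sketch of a round-robin picker that skips backends marked unhealthy. The class and method names are made up for this note; in practice the backend list would come from service discovery (DNS SRV records, Zookeeper, etc.) and health would come from healthchecks.

```python
class RoundRobinBalancer:
    """Pick backends in rotation, skipping ones currently marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0

    def mark_down(self, backend):
        self.healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, starting where we left off.
        for _ in range(len(self.backends)):
            candidate = self.backends[self._next % len(self.backends)]
            self._next += 1
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

The same rotation logic could live in a proxy instead; the difference with the client-side approach is only where the backend list and the counter are kept.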
Storage Scalability:
Refer to http://www.cloudbus.org/reports/DistributedStorageTaxonomy.pdf for a taxonomy of Distributed Storage Systems (DSS)
In summary, Distributed storage can be looked at from different perspectives. If we look at it from the point of view of "functionality" there is the following categorization:
- Archival: Provide persistent nonvolatile storage. Achieving reliability, even in the event of failure, supersedes all other objectives and data replication is a key instrument in achieving this
- General purpose filesystem: persistent nonvolatile POSIX-compliant filesystem, e.g. NFS, CODA, xFS
- Publish/Share: more volatile, think peer-to-peer
- Performance: operate in parallel over a fast network, typically striping data, e.g. Zebra
- Federation middleware: bring together various filesystems over a single API
- Custom: GFS (a combination of many of the things above)
- DHT : Store the keys associated with a node in that node's DNS records (e.g. TXT record) and the node info is obtained via SRV record for that node (refer to : https://labs.spotify.com/2013/02/25/in-praise-of-boring-technology/)
There are 4 main categories of cluster workloads (ref: https://eng.uber.com/peloton/)
- Stateless jobs
- Stateful jobs
- Batch jobs
- Daemon jobs
Thursday, May 2, 2019
Turing completeness using Mov
https://drive.google.com/open?id=1cbnCSdBmkjEGxoScn2VtcR7SiC8hc45x
Sunday, March 24, 2019
Type Systems in Computer programs
Friday, March 15, 2019
An O(ND) Difference Algorithm and Its Variations
https://neil.fraser.name/writing/diff/myers.pdf which is considered the best general purpose diff algorithm. See this: https://github.com/google/diff-match-patch
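The core of the O(ND) idea can be sketched in Python: for each edit count d, track the furthest-reaching x on every diagonal k = x - y, following matching elements ("snakes") for free. This sketch only returns the length of the shortest edit script (insertions plus deletions) and is my own condensed version of the greedy algorithm, not code from the paper or from diff-match-patch.

```python
def myers_edit_distance(a, b):
    """Length of the shortest insert/delete edit script turning a into b."""
    n, m = len(a), len(b)
    furthest = {1: 0}  # diagonal k -> furthest x reached so far
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and furthest[k - 1] < furthest[k + 1]):
                x = furthest[k + 1]        # step down: insert an element of b
            else:
                x = furthest[k - 1] + 1    # step right: delete an element of a
            y = x - k
            # Follow the diagonal while elements match (costs no edits).
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest[k] = x
            if x >= n and y >= m:
                return d
    return n + m
```

The running time depends on the number of differences D rather than only on the input sizes, which is why the approach works well when the two inputs are mostly similar.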
Friday, December 14, 2018
The unwritten laws of engineering
Thursday, December 13, 2018
Wednesday, December 12, 2018
Tuesday, November 20, 2018
Semantic versioning
Friday, November 2, 2018
Wednesday, October 31, 2018
Facebook Infra Overview
Friday, October 5, 2018
Arguments against OOP
Thursday, September 27, 2018
HAProxy internals
Things to follow up on:
(1) How is zero copy done? Using the "splice" system call on Linux
(2) MRU memory allocator
(3) Accepting multiple connections at the same time across different processes, each listening on different ports of course.
(4) Tree based storage, e.g. making heavy use of the Elastic Binary Tree http://wtarreau.blogspot.com/2011/12/elastic-binary-trees-ebtree.html
Thursday, June 7, 2018
Tuesday, June 5, 2018
Design patterns in C++
Factory pattern:
https://www.oodesign.com/factory-pattern.html
Wednesday, April 4, 2018
Blog on machine learning
Sunday, March 4, 2018
Mobile Real-time video segmentation
Monday, February 19, 2018
Monday, January 22, 2018
AI Magazine
What got me interested in this is https://www.quora.com/What-is-the-biggest-unresolved-problem-for-AI, which is about the quest for a general-purpose intelligence, i.e. building systems that can help us come up with the next "theory of relativity", or think like a human, etc. Today's AI is mostly focused on classification.
Thursday, January 18, 2018
The set-theoretic multiverse
http://lumiere.ens.fr/~dbonnay/files/talks/hamkins.pdf
Monday, May 22, 2017
Doxastic logic & types of reasoners
Something I read a long time back when I read Smullyan's books, but making a note because I love it. His types of reasoners are beautiful :)
Types of reasoners (writing Bp for "the reasoner believes p")
- Accurate reasoner: An accurate reasoner never believes any false proposition. (modal axiom T)
- Conceited reasoner: A conceited reasoner believes his or her beliefs are never inaccurate. A conceited reasoner with rationality of at least type 1 (see below) will necessarily lapse into inaccuracy.
- Consistent reasoner: A consistent reasoner never simultaneously believes a proposition and its negation. (modal axiom D)
- Normal reasoner: A normal reasoner is one who, while believing p, also believes he or she believes p. (modal axiom 4)
- Peculiar reasoner: A peculiar reasoner believes proposition p while also believing he or she does not believe p. Although a peculiar reasoner may seem like a strange psychological phenomenon (see Moore's paradox), a peculiar reasoner is necessarily inaccurate but not necessarily inconsistent.
- Reflexive reasoner: A reflexive reasoner is one for whom every proposition p has some proposition q such that the reasoner believes q ≡ (Bq → p).
- If a reflexive reasoner of type 4 (see below) believes Bp → p, he or she will believe p. This is a parallelism of Löb's theorem for reasoners.
- Unstable reasoner: An unstable reasoner is one who believes that he or she believes some proposition, but in fact does not believe it. This is just as strange a psychological phenomenon as peculiarity; however, an unstable reasoner is not necessarily inconsistent.
- Stable reasoner: A stable reasoner is not unstable. That is, for every p, if he or she believes Bp then he or she believes p. Note that stability is the converse of normality. We will say that a reasoner believes he or she is stable if, for every proposition p, he or she believes BBp → Bp (believing: "If I should ever believe that I believe p, then I really will believe p").
- Modest reasoner: A modest reasoner is one for whom, for every proposition p, he or she believes Bp → p only if he or she believes p. A modest reasoner never believes Bp → p unless he or she believes p. Any reflexive reasoner of type 4 is modest. (Löb's Theorem)
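For quick reference, the definitions above can be written compactly with a belief operator B, where Bp reads "the reasoner believes p". This is my own summary in standard modal notation, so treat the exact formulations as a sketch rather than a quotation from Smullyan.

```latex
\begin{align*}
\text{Accurate (T):}   \quad & \forall p:\; Bp \rightarrow p \\
\text{Consistent (D):} \quad & \forall p:\; Bp \rightarrow \neg B\neg p \\
\text{Normal (4):}     \quad & \forall p:\; Bp \rightarrow BBp \\
\text{Peculiar:}       \quad & \exists p:\; Bp \wedge B\neg Bp \\
\text{Reflexive:}      \quad & \forall p\; \exists q:\; B\bigl(q \equiv (Bq \rightarrow p)\bigr) \\
\text{Unstable:}       \quad & \exists p:\; BBp \wedge \neg Bp \\
\text{Stable:}         \quad & \forall p:\; BBp \rightarrow Bp \\
\text{Modest:}         \quad & \forall p:\; B(Bp \rightarrow p) \rightarrow Bp
\end{align*}
```

Written this way, it is easy to see that stability (BBp → Bp) is the converse of normality (Bp → BBp).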