Spark RDD coalesce() is used to reduce the number of partitions. For that purpose it is an optimized version of repartition(): it merges existing partitions instead of performing a full shuffle, so less data moves across partitions.
- Coalesce uses existing partitions to minimize the amount of data that’s shuffled.
- Coalesce does not perform a full shuffle and can reduce the number of partitions below the original number of partitions.
- Coalesce can only decrease the number of partitions and can create uneven partitions.
- If you go from 1000 partitions to 100 partitions, there will not be a shuffle. Instead, each of the 100 new partitions will claim 10 of the current partitions.
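The points above can be checked with a small sketch. It assumes a local Spark session (in spark-shell, `spark` and `sc` already exist); the names `wide`, `merged`, `unchanged` and `grown` are only illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; spark-shell already provides `spark` and `sc`.
val spark = SparkSession.builder().master("local[4]").appName("coalesce-sketch").getOrCreate()
val sc = spark.sparkContext

// 1000 input partitions, as in the example above.
val wide = sc.parallelize(1 to 100000, numSlices = 1000)

// coalesce merges existing partitions; no shuffle stage is introduced.
val merged = wide.coalesce(100)
println(merged.getNumPartitions)     // 100
println(merged.toDebugString)        // lineage contains CoalescedRDD and no ShuffledRDD

// Without a shuffle, asking for more partitions than exist leaves the count unchanged.
val unchanged = merged.coalesce(500)
println(unchanged.getNumPartitions)  // 100

// repartition(n) is coalesce(n, shuffle = true): it can increase the count, but it shuffles.
val grown = merged.repartition(500)
println(grown.getNumPartitions)      // 500
```

The last two calls show why coalesce() is described as "decrease only": increasing the partition count needs repartition(), which pays for a shuffle.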
Case:-1
Suppose you create an RDD with N partitions and then apply a filter transformation to it. Spark applies the transformation to each partition independently, so even if all the data in a partition is filtered out, the resulting RDD still has the same N partitions it was created with. The same holds for all narrow transformations (transformations that do not require a shuffle).
val rdd= sc.parallelize(1 to 4)
rdd.getNumPartitions
val rdd1= rdd.filter(x => x%2 == 0)
rdd1.collect
rdd1.getNumPartitions
In the above case, we created an RDD that contains the numbers 1 to 4 and has 8 partitions (the default parallelism in this session). After applying the filter transformation, the number of partitions is still 8, which means several partitions are now empty. In such situations, you can use coalesce() to reduce the number of partitions, as shown below.
val rdd2 = rdd1.coalesce(2)
rdd2.getNumPartitions
rdd2.collect
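To see the empty partitions directly, one option is glom(), which turns each partition into an array so its length can be counted. The sketch below is an illustration, not part of the original session: it assumes a local Spark session and passes numSlices = 8 explicitly so the result does not depend on the machine's default parallelism. The names `before` and `after` are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; spark-shell already provides `spark` and `sc`.
val spark = SparkSession.builder().master("local[4]").appName("coalesce-glom-sketch").getOrCreate()
val sc = spark.sparkContext

// Four elements spread over 8 partitions, then keep only the even numbers.
val rdd = sc.parallelize(1 to 4, numSlices = 8)
val rdd1 = rdd.filter(x => x % 2 == 0)

// Number of records in each partition before coalescing: 8 entries, most of them 0.
val before = rdd1.glom().map(_.length).collect()
println(before.mkString(", "))

// After coalesce(2) the same two records live in 2 partitions.
val rdd2 = rdd1.coalesce(2)
val after = rdd2.glom().map(_.length).collect()
println(after.mkString(", "))
```

glom() materializes every partition as an array, so it is only suitable for small RDDs like this one.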
Case:-2
As we know, Spark has two kinds of operations: wide (e.g. join and all the byKey operations) and narrow (e.g. map, flatMap, filter). Suppose you have 2 TB of data divided into 164 MB blocks and you apply narrow operations to it again and again. The partitions eventually become small, and if you then run a shuffle operation, performance degrades because the data has to travel between the nodes in many small pieces. That is the reason to coalesce before the shuffle operation.
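A minimal sketch of this pattern, assuming a local Spark session and a made-up dataset (the names `events`, `filtered`, `compact` and `totals`, and the target of 4 partitions, are illustrative choices, not recommendations):

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; spark-shell already provides `spark` and `sc`.
val spark = SparkSession.builder().master("local[4]").appName("coalesce-before-shuffle-sketch").getOrCreate()
val sc = spark.sparkContext

// Hypothetical input: 100000 keyed records in 200 partitions.
val events = sc.parallelize(1 to 100000, numSlices = 200).map(n => (n % 3, n))

// A selective narrow transformation: only 100 records survive, still spread over 200 partitions.
val filtered = events.filter { case (_, n) => n % 1000 == 0 }
println(filtered.getNumPartitions)  // 200

// Merge the nearly empty partitions before the wide operation.
val compact = filtered.coalesce(4)
println(compact.getNumPartitions)   // 4

// The shuffle now reads from 4 map-side partitions instead of 200.
val totals = compact.reduceByKey(_ + _).collect()
totals.foreach(println)
```

Because coalesce() is itself a narrow operation, the filter above also runs in only 4 tasks, so coalescing too aggressively can reduce the parallelism of the preceding work.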
Case:-3
When you operate on a DataFrame and save it as CSV, the output consists of multiple files because the DataFrame has many partitions. Coalescing it into 1 partition before writing results in a single CSV file (one part file inside the output directory).
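As a rough sketch of this case, the snippet below writes a small DataFrame through coalesce(1) and counts the resulting part files. It assumes a local Spark session; the temporary output directory and the names `df`, `outDir` and `partFiles` are hypothetical.

```scala
import java.io.File
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Local session for illustration only; spark-shell already provides `spark`.
val spark = SparkSession.builder().master("local[4]").appName("coalesce-csv-sketch").getOrCreate()

// A small DataFrame with one column "id", created with 8 partitions.
val df = spark.range(0, 1000, 1, 8).toDF()
println(df.rdd.getNumPartitions)  // 8

// Write into a fresh sub-directory of a temp dir (the target path must not exist yet).
val outDir = Files.createTempDirectory("coalesce-csv").resolve("out").toString

// One partition at write time means one part file in the output directory.
df.coalesce(1).write.option("header", "true").csv(outDir)

val partFiles = new File(outDir).listFiles()
  .filter(f => f.getName.startsWith("part-") && f.getName.endsWith(".csv"))
println(partFiles.length)         // 1
```

With coalesce(1) a single task writes all the data, so this is best kept for outputs that comfortably fit on one executor.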