Notes from November 17th, 2023
Weekly meeting of the TDP contributors to discuss open PRs, particularly those related to the TDP 1.1 release.
Pull Requests
Weekly review of open PRs (in chronological order):
- hadoop#5: Waiting for the contributor's answer.
- tdp-collection#799: The patch for Hive transactions and compactions on a PostgreSQL backend is still a WIP. Not all operations are functional. The patch will not guarantee stable behaviour, as many known issues are only fixed in Hive 4. This feature will be best effort only. As discussed, we will focus on bringing transactions and compactions with Hive 4 in a future release.
- tdp-collection#816: Merged.
- tdp-website#61: Merged.
- tdp-observability#59: The Promtail section should be checked. Spark is missing.
- tdp-collection#819: Can be merged after resolving conflicts.
- tdp-collection-prerequisites#97: To be reviewed by @PACordonnier.
- tdp-getting-started#296: To be reviewed by @PACordonnier.
- tdp-getting-started#297: Merged.
- tdp-getting-started#298: Merged.
- tdp-collection-prerequisites#99: Goes along with tdp-observability#60, to be reviewed.
- tdp-observability#60: To be reviewed by @rpignolet.
- tdp-getting-started#300: Merged. It is OK to fix the Python version for master. New branches should specify the minimal Python version and mention which versions have been tested.
- tdp-observability#61: The PR is still a draft; it can be merged once marked as ready.
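
As an illustration of the tdp-getting-started#300 remark about declaring a minimal Python version and listing the tested ones, here is a minimal sketch. The version numbers and the names MINIMUM_PYTHON, TESTED_PYTHONS and check_python are hypothetical and are not taken from any TDP repository.

```python
import sys

# Hypothetical values for illustration only; the notes do not state
# which Python versions a branch requires or has been tested with.
MINIMUM_PYTHON = (3, 8)
TESTED_PYTHONS = [(3, 8), (3, 9)]


def check_python(version=None):
    """Classify a (major, minor) version as 'unsupported', 'tested' or 'untested'."""
    if version is None:
        # Default to the interpreter running this script.
        version = sys.version_info[:2]
    version = tuple(version[:2])
    if version < MINIMUM_PYTHON:
        return "unsupported"
    if version in TESTED_PYTHONS:
        return "tested"
    # Above the minimum, but nobody has reported testing it yet.
    return "untested"


if __name__ == "__main__":
    print("Running interpreter is:", check_python())
```

A setup script could call check_python() early and stop on "unsupported", while only warning on "untested".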
TDP 1.1 release related PRs
- TDP#90: For now, builds have been tested for:
  - Hadoop 3.1.4
  - Tez 0.9.1
  - Hive 1.2.3
  - Spark 2.3.5
  - Hive 2.3.9

  Some issues remain for:
  - Spark 3.2.4, see spark#3
  - HBase Connectors, which can’t be built because it needs Spark 3 jars
  - HBase 2.1.10, which is not compatible with Hadoop 3.1.4 because of a dependency upgrade (jetty) that is not backported to 2.1.10 (EOL and shaded in later versions). @Pierrotws is working on a PR to upgrade jetty on HBase 2.1.10.
- spark#3: Issue with mvn install (see spark-old#3). We need to try to build Spark 3.2.4 with and without this PR to see if it improves the test output. Spark 2 is OK.
- ranger#4: HBase plugin performance improvements that have not been released by Ranger but could be used. Merged.
- phoenix#5: Merged. Build can be tested now.
- phoenix-queryserver#5: Merged. Build can be tested now.
- knox#3: Merged. Build can be tested now.
- hbase-connectors#3: Merged. Build can be tested now.
- hbase-operator-tools#3: Merged. Build can be tested now.
Open topics
- @sergkudinov, related to this discussion: A PR can be opened to reflect the suggested updates. The tdp-cheatsheet repository visibility has been set to private.
- @PaulFarault, about the future of the tdp-getting-started repository: tdp-getting-started will be archived along with tdp-vagrant. The content of the tdp-getting-started dev branch and of tdp-vagrant will be imported into tdp-dev. tdp-dev will still contain Git submodules, and the setup.sh script will be split into several utility scripts.
- @PaulFarault, about repository names: The discussion about how to name the collections repository should be updated in order to reach a decision.
- @PACordonnier: JMX exported metric names are not user-friendly. It would be nice to have a way to rewrite them using a pattern. See this article. A PR should be opened to discuss this topic.
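
To make the idea of pattern-based metric renaming more concrete, here is a minimal sketch of what such a rewrite could look like. It is only an illustration under stated assumptions: the raw name format, the RULES table and the rewrite_metric_name helper are hypothetical, and this is neither the approach of the referenced article nor an existing TDP feature.

```python
import re

# Hypothetical rewrite rules: (regex applied to the raw name, replacement template).
# The raw name shape used here is an assumption made for the example.
RULES = [
    (
        re.compile(r"^Hadoop<service=(\w+), name=(\w+)><>(\w+)$"),
        r"hadoop_\g<1>_\g<2>_\g<3>",
    ),
]


def rewrite_metric_name(raw_name, rules=RULES):
    """Return a friendlier metric name using the first matching rule."""
    for pattern, template in rules:
        match = pattern.match(raw_name)
        if match:
            # Fill the template with the captured groups, then normalise case.
            return match.expand(template).lower()
    # Fallback: collapse every run of non-alphanumeric characters into "_".
    return re.sub(r"[^0-9a-zA-Z_]+", "_", raw_name).strip("_").lower()


if __name__ == "__main__":
    print(rewrite_metric_name("Hadoop<service=NameNode, name=FSNamesystem><>CapacityTotal"))
```

Keeping the rules in a plain list makes it easy to review them in a PR and to add one rule per component.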